[
  {
    "path": ".claude/20260307_done.md",
    "content": "# 1、按 README.md 更新其他语种 readme\n\n# 2、更新 .github/workflows/ 的 release 流程\n\n## 触发机制：\n\n**upstream**（TeamWiseflow 正式仓库）每次合并 PR 后通过 github actions 自动更新版本号并触发 release 打包发布\n\n## 具体工作机制\n\n分别从 https://github.com/openclaw/openclaw 和 https://github.com/TeamWiseFlow/openclaw_for_business 拉取最新代码，拉取后按如下结构放置：\n\n```\nopenclaw_for_business/\n├── addons/\n│   └── wiseflow/   # 本项目代码仓内的 wiseflow/ 注意：不是整个项目目录\n└── openclaw/\n    └── \n```\n\n使用 github action 分别在最新的 ubuntu24.04、macos-latest 两个系统上进行端到端完整测试，保证执行`openclaw_for_business/scripts/reinstall-daemon.sh`脚本没问题后，直接连同 openclaw_for_business 和 openclaw 代码，保持上面的放置结构，打包为一个 zip 压缩包，发布到本代码仓的 release"
  },
  {
    "path": ".github/workflows/ci.yml",
    "content": "name: CI\n\non:\n  pull_request:\n    branches: [master]\n    types: [opened, synchronize, reopened]  # 明确排除 closed\n\n# 同一 PR/分支有新 commit 时，自动取消正在运行的旧任务\nconcurrency:\n  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}\n  cancel-in-progress: true\n\njobs:\n  test:\n    strategy:\n      matrix:\n        os: [ubuntu-24.04, macos-latest]\n      fail-fast: false\n    runs-on: ${{ matrix.os }}\n\n    steps:\n      - name: Checkout\n        uses: actions/checkout@v4\n\n      - name: Setup Node.js\n        uses: actions/setup-node@v4\n        with:\n          node-version: '20'\n\n      - name: Install pnpm\n        uses: pnpm/action-setup@v4\n        with:\n          version: latest\n\n      - name: Read pinned openclaw version\n        id: pin\n        run: |\n          source openclaw.version\n          echo \"commit=$OPENCLAW_COMMIT\" >> \"$GITHUB_OUTPUT\"\n          echo \"version=$OPENCLAW_VERSION\" >> \"$GITHUB_OUTPUT\"\n\n      - name: Clone openclaw at pinned commit\n        run: |\n          git init openclaw\n          git -C openclaw remote add origin https://github.com/openclaw/openclaw.git\n          git -C openclaw fetch --depth=1 origin ${{ steps.pin.outputs.commit }}\n          git -C openclaw checkout FETCH_HEAD\n\n      - name: Clone openclaw_for_business\n        run: git clone --depth=1 https://github.com/TeamWiseFlow/openclaw_for_business.git openclaw_for_business\n\n      - name: Set up addon directory structure\n        run: |\n          cp -r wiseflow openclaw_for_business/addons/wiseflow\n          cp -r openclaw openclaw_for_business/openclaw\n\n      # Run setup-crew.sh + apply-addons.sh separately.\n      # We intentionally skip the `pnpm openclaw daemon install` step that\n      # reinstall-daemon.sh would also execute: daemon installation requires\n      # a real user session (systemd on Linux, launchd on macOS) and cannot\n      # be meaningfully tested in a headless CI runner.\n      - 
name: Run setup-crew.sh\n        run: bash scripts/setup-crew.sh\n        working-directory: openclaw_for_business\n\n      - name: Run apply-addons.sh\n        run: bash scripts/apply-addons.sh\n        working-directory: openclaw_for_business\n"
  },
  {
    "path": ".github/workflows/release.yml",
    "content": "name: Auto Release\n\non:\n  pull_request_target:\n    types: [closed]\n    branches: [master]\n  workflow_dispatch:\n    inputs:\n      bump_type:\n        description: 'Version bump type'\n        required: false\n        default: 'patch'\n        type: choice\n        options:\n          - patch\n          - minor\n          - major\n\npermissions:\n  contents: write\n\n# 防止多个 PR 同时 merge 时并发触发重复 release\nconcurrency:\n  group: release\n  cancel-in-progress: false\n\njobs:\n  release:\n    # CI 已在 PR 期间验证过，此处直接做版本 bump + 打包发布\n    if: github.event.pull_request.merged == true || github.event_name == 'workflow_dispatch'\n    runs-on: ubuntu-latest\n\n    steps:\n      - name: Checkout\n        uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n          token: ${{ secrets.RELEASE_TOKEN || secrets.GITHUB_TOKEN }}\n\n      - name: Determine bump type from PR labels\n        id: bump\n        run: |\n          if [ \"${{ github.event_name }}\" = \"workflow_dispatch\" ]; then\n            echo \"type=${{ inputs.bump_type }}\" >> \"$GITHUB_OUTPUT\"\n          else\n            LABELS='${{ toJSON(github.event.pull_request.labels.*.name) }}'\n            if echo \"$LABELS\" | grep -q '\"major\"'; then\n              echo \"type=major\" >> \"$GITHUB_OUTPUT\"\n            elif echo \"$LABELS\" | grep -q '\"minor\"'; then\n              echo \"type=minor\" >> \"$GITHUB_OUTPUT\"\n            else\n              echo \"type=patch\" >> \"$GITHUB_OUTPUT\"\n            fi\n          fi\n\n      - name: Calculate new version\n        id: version\n        run: |\n          CURRENT=$(cat version | tr -d '[:space:]')\n          NUM=${CURRENT#v}\n\n          IFS='.' 
read -r MAJOR MINOR PATCH <<< \"$NUM\"\n          MAJOR=${MAJOR:-0}\n          MINOR=${MINOR:-0}\n          PATCH=${PATCH:-0}\n\n          BUMP=\"${{ steps.bump.outputs.type }}\"\n          if [ \"$BUMP\" = \"major\" ]; then\n            MAJOR=$((MAJOR + 1))\n            MINOR=0\n            PATCH=0\n          elif [ \"$BUMP\" = \"minor\" ]; then\n            MINOR=$((MINOR + 1))\n            PATCH=0\n          else\n            PATCH=$((PATCH + 1))\n            # Auto-carry: patch 累积到 10 时自动晋升 minor\n            if [ \"$PATCH\" -ge 10 ]; then\n              MINOR=$((MINOR + 1))\n              PATCH=0\n            fi\n          fi\n\n          NEW_VERSION=\"v${MAJOR}.${MINOR}.${PATCH}\"\n          echo \"new=$NEW_VERSION\" >> \"$GITHUB_OUTPUT\"\n          echo \"New version: $NEW_VERSION\"\n\n      - name: Update version file\n        run: echo \"${{ steps.version.outputs.new }}\" > version\n\n      - name: Commit and tag\n        run: |\n          git config user.name \"github-actions[bot]\"\n          git config user.email \"github-actions[bot]@users.noreply.github.com\"\n          git add version\n          git commit -m \"chore: bump version to ${{ steps.version.outputs.new }} [skip ci]\"\n          git tag \"${{ steps.version.outputs.new }}\"\n          git push origin master --tags\n\n      - name: Read pinned openclaw version\n        id: pin\n        run: |\n          source openclaw.version\n          echo \"commit=$OPENCLAW_COMMIT\" >> \"$GITHUB_OUTPUT\"\n          echo \"version=$OPENCLAW_VERSION\" >> \"$GITHUB_OUTPUT\"\n\n      - name: Clone openclaw at pinned commit\n        run: |\n          git init openclaw\n          git -C openclaw remote add origin https://github.com/openclaw/openclaw.git\n          git -C openclaw fetch --depth=1 origin ${{ steps.pin.outputs.commit }}\n          git -C openclaw checkout FETCH_HEAD\n\n      - name: Clone openclaw_for_business\n        run: git clone --depth=1 
https://github.com/TeamWiseFlow/openclaw_for_business.git openclaw_for_business\n\n      - name: Set up release directory structure\n        run: |\n          cp -r wiseflow openclaw_for_business/addons/wiseflow\n          cp -r openclaw openclaw_for_business/openclaw\n\n      - name: Package release\n        run: |\n          # 保留 openclaw/.git：apply-addons.sh 中 git apply --3way 依赖 git 仓库上下文\n          # 仅删除其他 .git 目录（wiseflow 本身、openclaw_for_business 等）\n          find openclaw_for_business -type d -name \".git\" \\\n            ! -path \"*/openclaw/.git\" \\\n            -exec rm -rf {} + 2>/dev/null || true\n          zip -r \"wiseflow-${{ steps.version.outputs.new }}.zip\" openclaw_for_business\n\n      - name: Create GitHub Release\n        env:\n          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}\n        run: |\n          gh release create \"${{ steps.version.outputs.new }}\" \\\n            \"wiseflow-${{ steps.version.outputs.new }}.zip\" \\\n            --title \"${{ steps.version.outputs.new }}\" \\\n            --generate-notes\n"
  },
  {
    "path": ".gitignore",
    "content": "# node\nnode_modules/\npackage-lock.json\n\n# default ignore\n/shelf/\n/workspace.xml\n.DS_Store\n.idea/\n__pycache__\n.env\n.venv/\n\n# temporary files\n*.tmp\n*.log\n*.pyc\n*.pyo\n*.pyd\n__pycache__/\n*.so\n.Python\npatchright/\npatchright-v*/\ntests/openclaw_for_business/\ntests/openclaw/\nopenclaw/\nopenclaw_for_business/\napply-addons.sh\n"
  },
  {
    "path": "CHANGELOG.md",
    "content": "# v5.0\n\nupgrade workflow to Agent!\n\n# v4.32\n- bug fix；\n\n- fixed the import error / failure to run when using RSS sources only.\n\n- update patchright to 1.57.2\n\n- clean up unused code\n\n# v4.3.1\n\n- 后端新增 info_stat 统计接口，并补齐 user_notify、user_prompt、ws_ping 等前端交互相关接口。\n\n  Added info_stat statistics endpoint and completed frontend interaction endpoints such as user_notify, user_prompt, and ws_ping.\n\n- read_info 参数与 task time_slots 枚举同步为当前实现。\n\n  Synced read_info parameters and task time_slots enum with the current implementation.\n\n- 后端接口文档更新，移除已弃用的 mc_backup_accounts CRUD 说明。\n\n  Updated backend API docs and removed deprecated mc_backup_accounts CRUD descriptions.\n\n# v4.30\n\n- 升级为与 pro 版本一样的架构，同时具有一样的 api，可无缝共享 [wiseflow+](https://github.com/TeamWiseFlow/wiseflow-plus) 生态！\n\n  Upgraded to the same architecture as the pro version, with the same api, seamlessly sharing the [wiseflow+](https://github.com/TeamWiseFlow/wiseflow-plus) ecosystem!\n\n# v4.2\n\n- 全新的网页爬取方案，使用 patchright 直连本地用户真实浏览器，从而实现更加强大的反爬虫伪装能力，以及提供用户数据持久化留存等特性；\n\n  Brand new web crawling solution: uses patchright to directly connect to the user's real local browser, providing much stronger anti-crawling disguise capabilities and features like persistent user data storage.\n\n- 配套提供预登录、清除、深度清除脚本\n\n  Provided supporting scripts for pre-login, cleanup, and deep cleanup.\n\n- 大幅简化 web crawler相关的 config\n\n  Greatly simplified web crawler-related configuration.\n\n- 新增了proxy方案（支持直连提供商服务器，动态获取，本地缓存）\n\n  Added a new proxy solution (supports direct connection to provider servers, dynamic acquisition, and local caching).\n\n- 整合 Crawl4ai script 方案，提供网页操作能力\n\n  Integrated Crawl4ai script solution, enabling web page operation capabilities.\n\n- 重构搜索引擎方案，适配新的爬取方案并修复一些累积问题\n\n  Refactored search engine solution to adapt to the new crawling approach and fixed some accumulated issues.\n\n- 升级 docker 部署方案，适配全新的打包 workflow。\n\n  Upgraded Docker deployment solution to fit the brand new packaging 
workflow.\n\n\n# v4.1\n\n- 通用llm提取支持设定 role 和 purpose，从而实现更加精准的提取\n\n  Universal LLM extraction supports setting role and purpose, enabling more precise extraction\n\n- 社交平台信源增加查找创作者详情的功能\n\n  Added functionality to search for creator details in social media platform sources\n\n- 增加自定义精准搜索功能（自定义 info 提取字段）\n\n  Added custom precision search functionality (custom info extraction fields)\n\n- 可以为关注点指定搜索源，目前支持 bing、github、arxiv、ebay 四个源，并且全部使用平台原生接口，无需额外申请并配置第三方 key\n\n  Can specify search sources for focus points, currently supporting four sources: bing, github, arxiv, ebay, all using platform native interfaces without requiring additional third-party key applications and configurations\n\n- 优化的缓存以及缓存遗忘机制\n\n  Optimized caching and cache forgetting mechanisms\n\n- 修复快手平台搜索结果为空时的错误处理\n\n  Fixed error handling when Kuaishou platform search results are empty\n\n# v4.0\n\n- 深度重构 Crawl4ai（0.6.3）和 MediaCrawler， 并整合引入 Nodriver，大幅提升获取能力，支持社交平台内容获取（4.0版本提供对微博和快手平台的支持）；\n\n  Deeply refactored Crawl4ai (0.6.3) and MediaCrawler, integrated Nodriver, significantly enhanced content acquisition capabilities, supporting social media platform content retrieval (version 4.0 provides support for Weibo and Kuaishou platforms);\n\n- 全新的架构，混合使用异步和线程池，大大提升处理效率（同时降低内存消耗）；\n\n  New architecture utilizing a hybrid approach of async and thread pools, greatly improving processing efficiency (while reducing memory consumption);\n\n- 继承了 Crawl4ai 0.6.3 版本的 dispacher 能力，提供更精细的内存管理能力；\n\n  Inherited the dispatcher capabilities from Crawl4ai 0.6.3 version, providing more refined memory management capabilities;\n\n- 深度整合了 3.9 版本中的 Pre-Process 和 Crawl4ai 的 Markdown Generation流程， 规避了重复处理；\n\n  Deeply integrated the Pre-Process from version 3.9 and Crawl4ai's Markdown Generation process, avoiding duplicate processing;\n\n- 放弃了通过 pocketbase 的api 进行数据库操作，改为直接读写 sqlite 数据库，因此无需用户在 .env 中提供pocketbase的账密，也规避了登录过期导致数据库无法读写，从而产生大量日志的隐患；\n\n  Abandoned database operations through PocketBase API, switched to 
direct SQLite database read/write, eliminating the need for users to provide PocketBase credentials in .env, and avoiding the risk of database read/write failures due to login expiration that could generate excessive logs;\n\n- 优化 llm 处理策略，更加符合思考模型的特性；\n\n  Optimized LLM processing strategy to better align with the characteristics of thinking models;\n\n- 优化了对 RSS 信源的支持；\n\n  Enhanced support for RSS sources;\n\n- 优化了代码仓文件结构，更加清晰且符合当代 python 项目规范；\n\n  Optimized repository file structure, making it clearer and more compliant with contemporary Python project standards;\n\n- 改为使用 uv 进行依赖管理，并优化了 requirement.txt 文件；\n\n  Switched to using uv for dependency management and optimized the requirement.txt file;\n\n- 优化了启动脚本（提供提供 windows 版本），真正做到\"一键启动\"；\n\n  Optimized startup scripts (including Windows version), achieving true \"one-click startup\";\n\n- 优化了日志输出，增加 recorder 总结，并提供更精细化的日志输出控制。\n\n  Enhanced log output, added recorder summaries, and provided more granular log output control.\n\n\n# v3.9-patch3\n\n- 更改版本号命名规则\n\n  Change version number naming rules\n\n- 诸多累积修复\n\n  Numerous cumulative fixes\n\n# v0.3.9-patch2\n\n- 定制更改 crawl4ai 0.4.30 版本，以取得更好的性能\n\n  Modified crawl4ai version 0.4.30 for better performance\n\n- 相应的更改 core/requirements.txt\n\n  Corresponding changes to core/requirements.txt\n\n- 更改 prompt，但未在 qwen2.5-14b 模型上发现改进\n\n  Modified the prompt, but no improvements were found on the qwen2.5-14b model\n\n\n# V0.3.9\n\n- 适配 Crawl4ai 0.4.248 版本，优化了性能\n\n  Adapt to Crawl4ai 0.4.248 version, optimized performance\n\n- 累积 bug 修复\n\n  Cumulative bug fixes\n\n- 增加 docker 运行方案（感谢 @braumye 贡献）\n\n  Added docker running solution (thanks to @braumye for contributing)\n\n\n# V0.3.8\n\n- 增加对 RSS 信源的支持\n\n  add support for RSS source\n\n- 支持为关注点指定信源，并且可以为每个关注点增加搜索引擎作为信源\n\n  support to specify source for each focus point, and add search engine as source\n\n- 进一步优化信息提取策略（每次只处理一个关注点）\n\n  Further optimized information extraction strategy (processing one focus point at 
a time)\n\n- 优化入口逻辑，简化并合并启动方案 （感谢 @c469591 贡献windows版本启动脚本）\n\n  Optimized entry logic, simplified and merged startup solutions (thanks to @c469591 for contributing Windows startup script)\n\n\n# V0.3.7\n\n- 新增通过wxbot方案获取微信公众号订阅消息信源（不是很优雅，但已是目前能找到的最佳方案）\n  \n  Added WeChat Official Account subscription message source acquisition through wxbot solution (not very elegant, but currently the best solution available)\n\n- 升级适配 Crawl4ai 0.4.247 版本，\n\n  Upgraded to fit Crawl4ai 0.4.247 version,\n\n- 通过新增预处理流程以及全新设计的推荐链接提取策略，大幅提升信息抓取效果，现在7b 这样的小模型也能比较好的完成复杂关注点（explanation中包含时间、指标限制这种）的提取了。\n\n  Through the addition of a new pre-processing process and a completely redesigned recommended link extraction strategy, the information capture effect has been significantly improved, and now even small models like 7b can better complete the extraction of complex focus points (such as time and index limits in the explanation).\n\n- 提供自定义提取器接口，方便用户根据实际需求进行定制。\n\n  Provided a custom extractor interface to allow users to customize according to actual needs.\n\n- bug 修复以及其他改进（crawl4ai浏览器生命周期管理，异步 llm wrapper 等）（感谢 @tusik 贡献）\n\n  Bug fixes and other improvements (crawl4ai browser lifecycle management, asynchronous llm wrapper, etc.)\n\n  Thanks to @tusik for contributing\n\n# V0.3.6\n- 改用 Crawl4ai 作为底层爬虫框架，其实Crawl4ai 和 Crawlee 的获取效果差别不大，二者也都是基于 Playwright ，但 Crawl4ai 的 html2markdown 功能很实用，而这对llm 信息提取作用很大，另外 Crawl4ai 的架构也更加符合我的思路；\n\n  Switched to Crawl4ai as the underlying web crawling framework. Although Crawl4ai and Crawlee both rely on Playwright with similar fetching results, Crawl4ai's html2markdown feature is quite practical for LLM information extraction. 
Additionally, Crawl4ai's architecture better aligns with my design philosophy.\n\n- 在 Crawl4ai 的 html2markdown 基础上，增加了 deep scraper，进一步把页面的独立链接与正文进行区分，便于后一步 llm 的精准提取。由于html2markdown和deep scraper已经将原始网页数据做了很好的清理，极大降低了llm所受的干扰和误导，保证了最终结果的质量，同时也减少了不必要的 token 消耗；\n\n  Built upon Crawl4ai's html2markdown, we added a deep scraper to further differentiate standalone links from the main content, facilitating more precise LLM extraction. The preprocessing done by html2markdown and deep scraper significantly cleans up raw web data, minimizing interference and misleading information for LLMs, ensuring higher quality outcomes while reducing unnecessary token consumption.\n\n   *列表页面和文章页面的区分是所有爬虫类项目都头痛的地方，尤其是现代网页往往习惯在文章页面的侧边栏和底部增加大量推荐阅读，使得二者几乎不存在文本统计上的特征差异。*\n   *这一块我本来想用视觉大模型进行 layout 分析，但最终实现起来发现获取不受干扰的网页截图是一件会极大增加程序复杂度并降低处理效率的事情……*\n\n  *Distinguishing between list pages and article pages is a common challenge in web scraping projects, especially when modern webpages often include extensive recommended readings in sidebars and footers of articles, making it difficult to differentiate them through text statistics.*\n\n  *Initially, I considered using large visual models for layout analysis, but found that obtaining undistorted webpage screenshots greatly increases program complexity and reduces processing efficiency...*\n  \n- 重构了提取策略、llm 的 prompt 等；\n\n  Restructured extraction strategies and LLM prompts;\n\n  *有关 prompt 我想说的是，我理解好的 prompt 是清晰的工作流指导，每一步都足够明确，明确到很难犯错。但我不太相信过于复杂的 prompt 的价值，这个很难评估，如果你有更好的方案，欢迎提供 PR*\n\n   *Regarding prompts, I believe that a good prompt serves as clear workflow guidance, with each step being explicit enough to minimize errors. However, I am skeptical about the value of overly complex prompts, which are hard to evaluate. 
If you have better solutions, feel free to submit a PR.*\n\n- 引入视觉大模型，自动在提取前对高权重（目前由 Crawl4ai 评估权重）图片进行识别，并补充相关信息到页面文本中；\n\n  Introduced large visual models to automatically recognize high-weight images (currently evaluated by Crawl4ai) before extraction and append relevant information to the page text;\n\n- 继续减少 requirement.txt 的依赖项，目前不需要 json_repair了（实践中也发现让 llm 按 json 格式生成，还是会明显增加处理时间和失败率，因此我现在采用更简单的方式，同时增加对处理结果的后处理）\n\n  Continued to reduce dependencies in requirement.txt; json_repair is no longer needed (in practice, having LLMs generate JSON format still noticeably increases processing time and failure rates, so I now adopt a simpler approach with additional post-processing of results)\n\n- pb info 表单的结构做了小调整，增加了 web_title 和 reference 两项。\n\n  Made minor adjustments to the pb info form structure, adding web_title and reference fields.\n\n- @ourines 贡献了 install_pocketbase.sh 脚本\n\n  @ourines contributed the install_pocketbase.sh script\n\n- @ibaoger 贡献了 windows 下的pocketbase 安装脚本\n\n  @ibaoger contributed the pocketbase installation script for Windows\n\n- docker运行方案被暂时移除了，感觉大家用起来也不是很方便……\n\n  Docker running solution has been temporarily removed as it wasn't very convenient for users...\n\n# V0.3.5\n- 引入 Crawlee(playwright模块)，大幅提升通用爬取能力，适配实际项目场景；\n  \n  Introduced Crawlee (playwright module), significantly enhancing general crawling capabilities and adapting to real-world tasks;\n\n- 完全重写了信息提取模块，引入\"爬-查一体\"策略，你关注的才是你想要的；\n\n  Completely rewrote the information extraction module, introducing an \"integrated crawl-search\" strategy, focusing on what you care about;\n\n- 新策略下放弃了 gne、jieba 等模块，去除了安装包；\n\n  Under the new strategy, modules such as gne and jieba have been abandoned, reducing the installation package size;\n\n- 重写了 pocketbase 的表单结构；\n  \n  Rewrote the PocketBase form structure;\n\n- llm wrapper引入异步架构、自定义页面提取器规范优化（含 微信公众号文章提取优化）；\n\n  llm wrapper introduces asynchronous architecture, customized page extractor specifications optimization (including WeChat 
official account article extraction optimization);\n\n- 进一步简化部署操作步骤。\n\n  Further simplified deployment steps.\n"
  },
  {
    "path": "CLAUDE.md",
    "content": "# CLAUDE.md\n\nThis file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.\n\n## Project Overview\n\nWiseflow v5.x is an **OpenClaw_for_business add-on** that enhances browser automation with anti-detection capabilities for [openclaw](https://github.com/openclaw/openclaw). It replaces Playwright with Patchright (undetected fork) and adds tab recovery, agent skills, and anti-bot strategies. In the future, it may also add extensions or plugins to openclaw.\n\nThe distributable unit is the `wiseflow/` directory, which is applied to OpenClaw by a script developed by another team: https://github.com/bigbrother666sh/openclaw_for_business/blob/main/scripts/apply-addons.sh.\n\nWe must stay compatible with the latest versions of both the openclaw project and the openclaw_for_business project.\n\n**OpenClaw_for_business is also called \"OFB\" for short.**\n\n## OpenClaw_for_business Add-on Architecture\n\n### Three-Layer Add-on Loading\n\nOpenClaw_for_business's `apply-addons.sh` processes our add-ons in this order:\n\n1. **`overrides.sh`** — pnpm overrides that swap `playwright-core` → `patchright-core` at the package manager level. Controlled by `PATCHRIGHT_VERSION` env var (default: 1.57.0). Also patches documentation references.\n\n2. **`patches/*.patch`** — Git patches applied to OpenClaw source. Currently `001-browser-tab-recovery.patch` adds snapshot-based tab recovery when the target tab disappears mid-session.\n\n**You must use `scripts/generate-patch.sh` to generate the final patch file.**\n\n3. **`skills/*/SKILL.md`** — Agent skill definitions installed into OpenClaw's skill system. 
`browser-guide/SKILL.md` teaches the agent login wall handling, CAPTCHA strategies, lazy-load scrolling, paywall detection, and tab cleanup.\n\n### Key Files\n\n| Path | Purpose |\n|------|---------|\n| `wiseflow/addon.json` | Package manifest (name, version, openclaw dependency) |\n| `wiseflow/overrides.sh` | pnpm override script, receives `$ADDON_DIR` and `$OPENCLAW_DIR` |\n| `wiseflow/patches/001-browser-tab-recovery.patch` | Tab resilience patch for browser tool |\n| `wiseflow/skills/browser-guide/SKILL.md` | Agent browser best practices |\n| `tests/run-managed-tests.mjs` | Automated test suite (Node.js ESM) |\n| `docs/anti-detection-research.md` | Technical analysis of detection mechanisms |\n| `version` | Current version string (v5.0) |\n\n### Deployment Model\n\n```\nopenclaw_for_business/\n  addons/\n    wiseflow/          ← copy of wiseflow/ directory\n      addon.json\n      overrides.sh\n      patches/\n      skills/\n```\n\nInstall: copy `wiseflow/` → `<openclaw>/addons/wiseflow`, then restart OpenClaw.\n\n## Development Workflow\n\n### 远程仓库\n\n- **origin** → `git@github.com:bigbrother666sh/wiseflow.git`（个人开发仓库）\n- **upstream** → `git@github.com:TeamWiseFlow/wiseflow.git`（TeamWiseflow 正式发布仓库）\n\n### 开发流程与注意事项\n\n1. 默认在 `master` 分支上开发，按需创建功能分支\n2. 本项目是基于 openclaw 进行 patch，同时必须遵循 openclaw_for_business(OFB)的 add-on 加载机制。因此你应该保证在 代码仓根目录始终克隆一份来自 https://github.com/openclaw/openclaw 的代码，同时下载一份 https://github.com/bigbrother666sh/openclaw_for_business/blob/main/scripts/apply-addons.sh \n每次开发前都应该进行一次拉取，然后基于最新的 openclaw 代码进行开发，并保证最后的产出适配 apply-addons.sh\n3. 遵循 tdd（测试驱动开发）流程，每次开发之后必须进行完整测试\n4. 本项目建立在其他一些开源项目基础上，比如[patchright](https://github.com/Kaliiiiiiiiii-Vinyzu/patchright), 随着项目发展，你需要记录一份我们的依赖清单，对于每一个你都可以 clone 一份代码到项目根目录下，以便随时查看我们是否有必要跟着升级，但记得同步更新 .gitignore 文件, 避免混入提交\n5. 开发完成后推送到 **origin**（个人仓库)\n6. 阶段性成果通过 GitHub PR 从 origin 合并到 **upstream**（TeamWiseflow 正式仓库）\n7. 
**upstream**（TeamWiseflow 正式仓库）每次合并 PR 后自动更新版本号并触发 release 打包发布\n\n注：有时我会通过在 .claude/ 中留下 TODO.md 的方式下发开发任务，这些任务你完成后需要把 TODO.md 改名为 {date}_done.md\n\n### 版本管理\n\n版本号存储在 `version` 文件中，格式为 `vMAJOR.MINOR.PATCH`。当 PR 合并到 upstream 的 master 时，GitHub Action 自动递增版本号并创建 Release。通过 PR 标签控制递增类型：\n- `major` 标签 → 大版本升级\n- `minor` 标签 → 功能版本升级\n- 无标签或 `patch` 标签 → 补丁版本升级（默认）\n\n**不要手动修改 `version` 文件**，由 CI 自动维护。\n\n## Permissions\n\nClaude Code 被授权在本仓库中执行任何 git 命令（包括 push、branch、tag 等），无需逐次确认。\n"
  },
  {
    "path": "LICENSE",
    "content": "# Open Source License\n\nwiseflow is licensed under a modified version of the Apache License 2.0, with the following additional conditions:\n\n1. Wiseflow may be utilized commercially. Should the conditions below be met, a commercial license must be obtained from the producer:\n\na. Multi-tenant service: Unless explicitly authorized by Wiseflow in writing, you may not use the Wiseflow source code to operate a multi-tenant environment.\n    - Tenant Definition: Within the context of Wiseflow, one tenant corresponds to one workspace.\n The workspace provides a separated area for each tenant's data and configurations.\n\nb. LOGO and copyright information: In the process of using Wiseflow's frontend, you may not remove or modify the LOGO or copyright information in the Wiseflow console or applications. This restriction is inapplicable to uses of Wiseflow that do not involve its frontend.\n\n    - Frontend Definition: For the purposes of this license, the \"frontend\" of Wiseflow includes all components located in the `web/` directory when running Wiseflow from the raw source code, or the \"web\" image when running Wiseflow with Docker.\n\nc. Prohibited usage: Using Wiseflow for commercial web crawling or data harvesting operations.\n\nd. Prohibited usage: Using Wiseflow for any unlawful or unauthorized scraping, including activities that violate applicable laws, website terms of service, or robots exclusion directives.\n\ne. Prohibited usage: Using Wiseflow to obtain, copy, or distribute content from media platforms and trading platforms or other materials protected by third-party intellectual property rights, unless you have obtained prior explicit authorization from the rights holder.\n\n2. As a contributor, you should agree that:\n\na. The producer can adjust the open-source agreement to be more strict or relaxed as deemed necessary.\nb. 
Your contributed code may be used for commercial purposes, including but not limited to its cloud business operations.\n\nApart from the specific conditions mentioned above, all other rights and restrictions follow the Apache License 2.0. Detailed information about the Apache License 2.0 can be found at http://www.apache.org/licenses/LICENSE-2.0.\n\nThe interactive design of this product is protected by appearance patent.\n\n© 2025 Team Wiseflow"
  },
  {
    "path": "README.md",
    "content": "# Wiseflow\n\n**[English](README_EN.md) | [日本語](README_JP.md) | [한국어](README_KR.md) | [Deutsch](README_DE.md) | [Français](README_FR.md) | [العربية](README_AR.md)**\n\n🚀 **STEP INTO 5.x**\n\n> 📌 **寻找 4.x 版本？** 原版 v4.30 及之前版本的代码在 [`4.x` 分支](https://github.com/TeamWiseFlow/wiseflow/tree/4.x)中。\n\n```\n“吾生也有涯，而知也无涯。以有涯随无涯，殆已！“ —— 《庄子·内篇·养生主第三》\n```\n\nwiseflow 4.x(包括之前的版本) 通过一系列精密的 workflow 实现了在特定场景下的强大的获取能力，但依然存在诸多局限性：\n\n- 1. 无法获取交互式内容（需要经过点选才能出现的内容，尤其是动态加载的情况）\n- 2. 只能进行信息过滤与提取，几乎没有任何下游任务能力\n- ……\n\n虽然我们一直致力于完善它的功能、扩增它的边界，但真实世界是复杂的，真实的互联网也一样，规则永无可能穷尽，因此固定的 workflow 永远做不到适配所有场景，这不是 wiseflow 的问题，这是传统软件的问题！\n\n然而过去一年 Agent 的突飞猛进，让我们看到了由大模型驱动完全模拟人类互联网行为在技术上的可能，[openclaw](https://github.com/openclaw/openclaw) 的出现更让我们坚定了此信念。\n\n更奇妙的是，通过前期的实验和探索，我们发现将 wiseflow 的获取能力以”插件“形式融入 openclaw，即可以完美解决上面提到的两个局限性。\n\nhttps://github.com/user-attachments/assets/8d097b3b-f9ab-42eb-98bb-88af5d28b089\n\n需要说明的是：openclaw 的 plugin 系统与传统上我们理解的“插件”（类似 claude code 的 plugin）并不相同，因此我们不得不额外提出了“add-on\"的概念，所以确切的说，wiseflow5.x 将以 openclaw add-on 的形态出现。原版的 openclaw 并不具有”add-on“架构，不过实际上，你只需要几条简单的 shell 命令即可完成这个”改造“。我们也准备了开箱即用、同时包含一系列针对真实商用场景预设配置的 openclaw 强化版本，即 [openclaw_for_business](https://github.com/TeamWiseFlow/openclaw_for_business), 你可以直接 clone ，并将 wiseflow release 解压缩放置于 openclaw_for_business 的 add-on 文件夹内即可。\n\n## ✨ 通过安装 wiseflow 你能获得什么（强于原版 openclaw）？\n\n### 1. 反检测浏览器，且无需安装浏览器插件\n\nwiseflow 的 patch-001 将 openclaw 内置的 Playwright 替换为 [Patchright](https://github.com/Kaliiiiiiiiii-Vinyzu/patchright)（Playwright 的反检测 fork），显著降低自动化浏览器被目标网站识别和拦截的概率。从而实现不需要安装 chrome relay extension，只用托管浏览器也能达到与 relay 同样、甚至更优的网络获取与操作能力。\n\n📥 *我们综合考察了目前市面上流行的各浏览器自动化框架，包括 nodriver、browser-use、vercel 的 agent-browser等，目前可以确认的是虽然基本原理都是通过走 cdp 并提供持久化 openclaw 专用的 profile，但是只有 patchright 提供了完全的针对 CDP 探针的移除，换言之，即便是用最纯粹的 cdp 直连方案，也是带有特征的，即也是可以被检测到的。其他框架的定位是自动化测试目的，而非获取目的，而 patchright 本身就定位于获取，并且它本质上是 playwright 的 patch，继承了几乎全部的 playwright 上层 api，这就天然与 openclaw 兼容，不必额外安装任何插件或者mcp*\n\n### 2. 
标签页自动恢复机制\n\n当 Agent 操作过程中目标标签页意外关闭或消失时，自动进行快照级别的标签页恢复，确保任务不会因标签页丢失而中断。\n\n### 3. Smart Search（智能搜索） Skill\n\n替代 openclaw 内置的 `web_search`，提供更强大的搜索能力。相比原版内置的 web search tool，Smart Search 具备三大核心优势：\n\n- **完全免费，无需 API Key**：不依赖任何第三方搜索 API，零成本使用\n- **即时搜索，时效性最佳**：直接驱动浏览器前往目标页面或各大社交媒体平台（微博、Twitter/X、facebook 等）进行搜索，第一时间获取最新发布的内容\n- **信源可自定义**：用户可以自由指定搜索源，精准匹配自己的信息需求\n\n### 4. 新媒体小编 Crew（预设 AI Agent）\n\n开箱即用的中文自媒体内容创作 AI Agent，深耕微博、小红书、知乎、B 站、抖音等国内主流平台。\n\n**主要能力：**\n\n- 选题研究 + 热点分析（Mode A）\n- 草稿扩写 + 网络佐证（Mode B）\n- 文章定稿后自动调用 [文颜（Wenyan）](https://github.com/caol64/wenyan) 渲染为公众号风格 HTML，支持 7 套内置主题智能匹配\n- 可直接推送微信公众号草稿箱（Mode C，需配置 `WECHAT_APP_ID`/`WECHAT_APP_SECRET`）\n- 支持 AI 文生图（[SiliconFlow](https://cloud.siliconflow.cn/i/WNLYbBpi) 图片/视频生成，需配置 `SILICONFLOW_API_KEY`）\n\n## 🌟 快速开始\n\n> **💡 模型费用说明**\n>\n> wiseflow5.x 底层基于 openclaw，Agent 工作流对 token 消耗有一定要求，建议先准备好大模型 API：\n>\n> - **国内用户（推荐）**：[硅基流动（SiliconFlow）](https://cloud.siliconflow.cn/i/WNLYbBpi) — 注册并实名认证可领取免费全平台模型代金券，覆盖上手阶段所需费用（配置模板中已预置 siliconflow.cn 的最佳实践，可直接使用）。😄 欢迎使用我的[推荐链接](https://cloud.siliconflow.cn/i/WNLYbBpi)注册，你我都会获赠 ¥16 平台奖励\n> - **OpenAI / 海外闭源模型**：推荐 [AiHubMix](https://aihubmix.com?aff=Gp54) — 国内直连无障碍。😄 欢迎使用我的[邀请链接](https://aihubmix.com?aff=Gp54)注册\n> - **海外用户**：可直接使用 SiliconFlow 国际版：https://www.siliconflow.com/\n\n直接从本代码仓的 [Releases](https://github.com/TeamWiseFlow/wiseflow/releases) 下载包含了 openclaw_for_business 和 wiseflow addon 的整合压缩包。\n\n1. 下载压缩包并解压缩\n2. 进入解压缩后的文件夹\n3. 
根据需求选择启动方式：\n\n   **调试模式**（单次启动，适合测试和开发）：\n   ```bash\n   ./scripts/dev.sh gateway\n   ```\n\n   **生产模式**（安装为系统服务，适合长期运行）：\n   ```bash\n   ./scripts/reinstall-daemon.sh\n   ```\n\n> **系统要求**\n> - 推荐使用 **Ubuntu 22.04** 系统\n> - 支持 **Windows WSL2** 环境\n> - 支持 **macOS**\n> - **不支持**直接在 Windows（原生）下运行\n\n### 【备用】手动方案\n\n注意：你需要先下载部署 openclaw_for_business，下载地址为：https://github.com/TeamWiseFlow/openclaw_for_business/releases\n\n复制代码仓内的 wiseflow 文件夹（注意不是代码仓本身）到 openclaw_for_business 的 `addons/` 目录：\n\n```bash\n# 方式一：从 wiseflow 仓库复制\ngit clone https://github.com/TeamWiseFlow/wiseflow.git /tmp/wiseflow\ncp -r /tmp/wiseflow/wiseflow <openclaw_for_business>/addons/wiseflow\n```\n\n安装后重启 openclaw_for_business 即可生效。\n\n## 目录结构\n\n```\nwiseflow/                         # addon 包（复制到 addons/ 目录使用）\n├── addon.json                    # 元数据\n├── overrides.sh                  # pnpm overrides + 禁用内置 web_search\n├── patches/\n│   ├── 001-browser-tab-recovery.patch        # 标签页恢复补丁\n│   ├── 002-disable-web-search-env-var.patch  # 禁用内置 web_search（env var）\n│   └── 003-act-field-validation.patch        # ACT 字段校验补丁\n├── skills/                       # 全局技能（所有 Agent 可用）\n│   ├── browser-guide/SKILL.md    # 浏览器最佳实践（登录/验证码/懒加载等）\n│   ├── smart-search/SKILL.md     # 多平台搜索 URL 构造（替代内置 web_search）\n│   └── rss-reader/               # RSS/Atom Feed 读取器\n│       ├── SKILL.md\n│       ├── package.json\n│       └── scripts/fetch-rss.mjs\n└── crew/                         # 预设 AI Agent（Crew 模板）\n    └── new-media-editor/         # 新媒体小编（中文自媒体内容创作）\n        ├── IDENTITY.md / SOUL.md / AGENTS.md / TOOLS.md / ...\n        └── skills/               # Crew 专属技能\n            ├── siliconflow-img-gen/   # 文生图（SiliconFlow API）\n            ├── siliconflow-video-gen/ # 文生视频（SiliconFlow API）\n            └── wenyan-formatter/      # Markdown → 公众号 HTML / 推送草稿\n\ndocs/                             # 技术文档（代码仓根目录）\n├── anti-detection-research.md\n└── more_powerful_search_skill/\n\nscripts/                          
# 工具脚本（代码仓根目录）\n└── generate-patch.sh\n\ntests/                            # 测试用例和脚本（代码仓根目录）\n├── README.md\n└── run-managed-tests.mjs\n```\n\n##  WiseFlow Pro 版本现已发布！\n\n更强的抓取能力、更全面的社交媒体支持、含 UI 界面和免部署一键安装包！\n\nhttps://github.com/user-attachments/assets/57f8569c-e20a-4564-a669-1200d56c5725\n\n🔥 **Pro 版本现已面向全网发售**：https://shouxiqingbaoguan.com/ \n\n🌹 即日起为 wiseflow 开源版本贡献 PR（代码、文档、成功案例分享均欢迎），一经采纳，贡献者将获赠 wiseflow pro版本一年使用权！\n\n## 🛡️ 许可协议\n\n自4.2版本起，我们更新了开源许可协议，敬请查阅： [LICENSE](LICENSE) \n\n商用合作，请联系 **Email：zm.zhao@foxmail.com**\n\n## 📬 联系方式\n\n有任何问题或建议，欢迎通过 [issue](https://github.com/TeamWiseFlow/wiseflow/issues) 留言。\n\n🎉 wiseflow && OFB 目前提供付费知识库，包含《手把手从零开始安装教程》、各种独门应用秘籍等，以及 **vip微信交流群**：\n\n欢迎添加”掌柜的“企业微信咨询了解：\n\n<img width=\"360\" height=\"360\" alt=\"wiseflow掌柜\" src=\"https://github.com/user-attachments/assets/b013b3fd-546e-4176-b418-57bee419e761\" />\n\n🌹 开源不易，感谢支持！\n\n## 🤝 wiseflow5.x 基于如下优秀的开源项目：\n\n- Patchright(Undetected Python version of the Playwright testing and automation library) https://github.com/Kaliiiiiiiiii-Vinyzu/patchright-python\n- Feedparser（Parse feeds in Python） https://github.com/kurtmckee/feedparser\n- SearXNG（a free internet metasearch engine which aggregates results from various search services and databases） https://github.com/searxng/searxng\n- 文颜（Wenyan）多平台 Markdown 排版与发布工具（新媒体小编 Crew 通过 wenyan-formatter 技能调用） https://github.com/caol64/wenyan\n\n## Citation\n\n如果您在相关工作中参考或引用了本项目的部分或全部，请注明如下信息：\n\n```\nAuthor：Wiseflow Team\nhttps://github.com/TeamWiseFlow/wiseflow\n```\n\n## 友情链接\n\n[<img src=\"https://github.com/TeamWiseFlow/wiseflow/raw/4.x/docs/logos/SiliconFlow.png\" alt=\"siliconflow\" width=\"360\">](https://cloud.siliconflow.cn/i/WNLYbBpi)\n"
  },
  {
    "path": "README_AR.md",
    "content": "<div dir=\"rtl\">\n\n# Wiseflow\n\n**[中文](README.md) | [English](README_EN.md) | [日本語](README_JP.md) | [한국어](README_KR.md) | [Deutsch](README_DE.md) | [Français](README_FR.md)**\n\n🚀 **STEP INTO 5.x**\n\n> 📌 **تبحث عن الإصدار 4.x؟** الكود الأصلي للإصدار v4.30 والإصدارات السابقة متوفر في [فرع `4.x`](https://github.com/TeamWiseFlow/wiseflow/tree/4.x).\n\n```\n\"حياتي لها حدود، لكن المعرفة بلا حدود. أن تلاحق اللامحدود بالمحدود — فذلك خطر محدق!\" — تشوانغ تزو، الفصول الداخلية، تغذية مبدأ الحياة\n```\n\nحقق wiseflow 4.x (بما في ذلك الإصدارات السابقة) قدرات قوية في جمع البيانات في سيناريوهات محددة من خلال سلسلة من سير العمل الدقيقة، لكنه لا يزال يعاني من قيود كبيرة:\n\n- 1. عدم القدرة على جمع المحتوى التفاعلي (المحتوى الذي لا يظهر إلا بعد النقر، خاصة في حالات التحميل الديناميكي)\n- 2. يقتصر على تصفية واستخراج المعلومات، مع غياب شبه كامل لقدرات معالجة المهام اللاحقة\n- ……\n\nعلى الرغم من أننا عملنا باستمرار على تحسين وظائفه وتوسيع حدوده، إلا أن العالم الحقيقي معقد، وكذلك الإنترنت. لا يمكن أن تكون القواعد شاملة أبداً، لذا فإن سير العمل الثابت لن يتمكن أبداً من التكيف مع جميع السيناريوهات. هذه ليست مشكلة wiseflow — إنها مشكلة البرمجيات التقليدية!\n\nومع ذلك، أظهر لنا التقدم السريع في تقنية الوكلاء (Agents) خلال العام الماضي الإمكانية التقنية لمحاكاة سلوك الإنسان على الإنترنت بالكامل بواسطة نماذج اللغة الكبيرة. وقد عزز ظهور [openclaw](https://github.com/openclaw/openclaw) هذا الاقتناع بشكل أكبر.\n\nوالأكثر إثارة للدهشة أنه من خلال تجاربنا واستكشافاتنا المبكرة، اكتشفنا أن دمج قدرات جمع البيانات في wiseflow في openclaw على شكل \"إضافات\" يحل المشكلتين المذكورتين أعلاه بشكل مثالي.\n\nhttps://github.com/user-attachments/assets/8d097b3b-f9ab-42eb-98bb-88af5d28b089\n\nتجدر الإشارة إلى أن نظام الإضافات في openclaw يختلف كثيراً عما نفهمه تقليدياً بـ\"الإضافات\" (المشابهة لإضافات Claude Code). لذلك اضطررنا إلى تقديم مفهوم \"add-on\". وبشكل دقيق، سيظهر wiseflow 5.x على شكل add-on لـ openclaw. 
لا يحتوي openclaw الأصلي على بنية \"add-on\"، لكن عملياً تحتاج فقط إلى بضعة أوامر shell بسيطة لإتمام هذا \"التحويل\". كما أعددنا نسخة معززة من openclaw جاهزة للاستخدام مع إعدادات مسبقة لسيناريوهات الأعمال الحقيقية: [openclaw_for_business](https://github.com/TeamWiseFlow/openclaw_for_business). يمكنك ببساطة استنساخها ووضع إصدار wiseflow في مجلد add-on الخاص بـ openclaw_for_business.\n\n## ✨ ما الذي ستكسبه بتثبيت wiseflow (أفضل من openclaw الأصلي)؟\n\n### 1. متصفح مضاد للكشف، دون الحاجة لتثبيت أي إضافات للمتصفح\n\nيستبدل patch-001 الخاص بـ wiseflow برنامج Playwright المدمج في openclaw بـ [Patchright](https://github.com/Kaliiiiiiiiii-Vinyzu/patchright) (نسخة fork غير قابلة للكشف من Playwright)، مما يقلل بشكل كبير من احتمالية اكتشاف المتصفحات الآلية وحجبها من قبل المواقع المستهدفة. وهذا يعني أنه دون الحاجة إلى تثبيت امتداد Chrome Relay، يمكن لمتصفح مُدار وحده تحقيق قدرات اكتساب وتشغيل الويب المماثلة لإعداد relay أو حتى أفضل منه.\n\n📥 *قمنا بتقييم جميع أطر عمل أتمتة المتصفح الرائجة في السوق، بما في ذلك nodriver وbrowser-use وagent-browser من Vercel. يمكننا التأكيد أنه رغم أن جميعها تعمل عبر CDP وتوفر ملفات تعريف مخصصة ومستمرة لـ openclaw، إلا أن Patchright وحده يوفر إزالة كاملة لبصمات CDP. بعبارة أخرى، حتى نهج الاتصال المباشر بـ CDP الأكثر نقاءً لا يزال يحمل توقيعات قابلة للكشف. تم تصميم الأطر الأخرى للاختبار الآلي، وليس لجمع البيانات، بينما تم تصميم Patchright خصيصاً للاستحواذ. ونظراً لأنه في جوهره تصحيح (patch) على Playwright، فإنه يرث تقريباً جميع واجهات برمجة التطبيقات عالية المستوى الخاصة بـ Playwright، مما يجعله متوافقاً بطبيعته مع openclaw دون الحاجة إلى تثبيت أي إضافات أو MCP إضافية.*\n\n### 2. آلية الاسترداد التلقائي لعلامات التبويب\n\nعندما تُغلق أو تُفقد علامة تبويب مستهدفة بشكل غير متوقع أثناء عملية Agent، يقوم النظام تلقائياً بإجراء استرداد علامة التبويب بناءً على لقطات الحالة، مما يضمن عدم انقطاع المهام بسبب فقدان علامة التبويب.\n\n### 3. مهارة البحث الذكي (Smart Search Skill)\n\nيحل محل `web_search` المدمج في openclaw بقدرات بحث أكثر قوة. 
مقارنةً بأداة web search المدمجة الأصلية، يتميز البحث الذكي بثلاث مزايا جوهرية:\n\n- **مجاني تماماً، لا يتطلب مفتاح API**: لا يعتمد على أي API بحث من طرف ثالث — تكلفة صفرية\n- **بحث فوري لأقصى درجات الحداثة**: يوجّه المتصفح مباشرةً إلى الصفحات المستهدفة أو منصات التواصل الاجتماعي الكبرى (ويبو، Twitter/X، Facebook، إلخ) للحصول فوراً على أحدث المنشورات\n- **مصادر بحث قابلة للتخصيص**: يمكن للمستخدمين تحديد مصادر بحثهم بحرية للحصول على معلومات دقيقة وموجّهة\n\n### 4. New Media Editor Crew (وكيل AI مسبق الإعداد)\n\nوكيل AI جاهز للاستخدام لإنشاء محتوى وسائل التواصل الاجتماعي الصينية، متخصص في المنصات الرئيسية الصينية مثل Weibo وXiaohongshu وZhihu وBilibili وDouyin.\n\n**القدرات الرئيسية:**\n\n- بحث الموضوعات + تحليل الاتجاهات (الوضع A)\n- توسيع المسودة + إضافة أدلة من الإنترنت (الوضع B)\n- بعد الانتهاء من المقال، استدعاء [Wenyan](https://github.com/caol64/wenyan) تلقائياً لتحويله إلى HTML بتنسيق حساب WeChat العام (7 قوالب مدمجة)\n- الدفع المباشر إلى صندوق مسودات حساب WeChat العام (الوضع C، يتطلب `WECHAT_APP_ID`/`WECHAT_APP_SECRET`)\n- دعم توليد الصور/الفيديو بالذكاء الاصطناعي ([SiliconFlow](https://www.siliconflow.com/) لتوليد الصور/الفيديو، يتطلب `SILICONFLOW_API_KEY`)\n\n## 🌟 البدء السريع\n\n> **💡 ملاحظة حول تكاليف API**\n>\n> يعتمد wiseflow 5.x على سير عمل Agent الخاص بـ openclaw، مما يتطلب الوصول إلى واجهة برمجة تطبيقات LLM. نوصي بإعداد بيانات اعتماد API مسبقاً:\n>\n> - **المستخدمون الدوليون (موصى به)**: [SiliconFlow](https://www.siliconflow.com/) — رصيد مجاني متاح بعد التسجيل يغطي تكاليف الاستخدام الأولي\n> - **OpenAI / Anthropic ومزودون آخرون**: أي API متوافق يعمل\n\nقم بتنزيل الحزمة المتكاملة (التي تشمل openclaw_for_business وإضافة wiseflow) مباشرةً من [Releases](https://github.com/TeamWiseFlow/wiseflow/releases) لهذا المستودع.\n\n1. تنزيل الأرشيف وفك ضغطه\n2. الانتقال إلى المجلد المستخرج\n3. 
اختيار وضع التشغيل:\n\n**وضع التصحيح** (تشغيل مفرد، للاختبار والتطوير):\n\n<div dir=\"ltr\">\n\n```bash\n./scripts/dev.sh gateway\n```\n\n</div>\n\n**وضع الإنتاج** (التثبيت كخدمة نظام، للتشغيل طويل الأمد):\n\n<div dir=\"ltr\">\n\n```bash\n./scripts/reinstall-daemon.sh\n```\n\n</div>\n\n> **متطلبات النظام**\n> - يُنصح باستخدام نظام **Ubuntu 22.04**\n> - بيئة **Windows WSL2** مدعومة\n> - **macOS** مدعوم\n> - التشغيل المباشر على **Windows الأصلي** **غير مدعوم**\n\n### [بديل] التثبيت اليدوي\n\n> ملاحظة: تحتاج أولاً إلى تنزيل ونشر openclaw_for_business من: https://github.com/TeamWiseFlow/openclaw_for_business/releases\n\nانسخ مجلد `wiseflow` من هذا المستودع (وليس المستودع بأكمله) إلى مجلد `addons/` الخاص بـ openclaw_for_business:\n\n<div dir=\"ltr\">\n\n```bash\n# الطريقة 1: الاستنساخ من مستودع wiseflow\ngit clone https://github.com/TeamWiseFlow/wiseflow.git /tmp/wiseflow\ncp -r /tmp/wiseflow/wiseflow <openclaw_for_business>/addons/wiseflow\n```\n\n</div>\n\nأعد تشغيل openclaw_for_business بعد التثبيت لتفعيل التغييرات.\n\n## هيكل المجلدات\n\n<div dir=\"ltr\">\n\n```\nwiseflow/                         # حزمة addon (انسخها إلى مجلد addons/)\n├── addon.json                    # البيانات الوصفية\n├── overrides.sh                  # pnpm overrides + تعطيل web_search المدمج\n├── patches/\n│   ├── 001-browser-tab-recovery.patch        # رقعة استعادة علامات التبويب\n│   ├── 002-disable-web-search-env-var.patch  # تعطيل web_search المدمج (env var)\n│   └── 003-act-field-validation.patch        # رقعة التحقق من حقول ACT\n├── skills/                       # المهارات العامة (متاحة لجميع الوكلاء)\n│   ├── browser-guide/SKILL.md    # أفضل ممارسات المتصفح (تسجيل الدخول/CAPTCHA/التحميل الكسول، إلخ)\n│   ├── smart-search/SKILL.md     # منشئ URL البحث متعدد المنصات (يحل محل web_search المدمج)\n│   └── rss-reader/               # قارئ خلاصات RSS/Atom\n│       ├── SKILL.md\n│       ├── package.json\n│       └── scripts/fetch-rss.mjs\n└── crew/                         # وكلاء AI مسبقو 
الإعداد (قوالب Crew)\n    └── new-media-editor/         # محرر الوسائط الجديدة (إنشاء محتوى وسائل التواصل الاجتماعي الصينية)\n        ├── IDENTITY.md / SOUL.md / AGENTS.md / TOOLS.md / ...\n        └── skills/               # مهارات خاصة بـ Crew\n            ├── siliconflow-img-gen/   # توليد صور AI (SiliconFlow API)\n            ├── siliconflow-video-gen/ # توليد فيديو AI (SiliconFlow API)\n            └── wenyan-formatter/      # Markdown → HTML WeChat / إرسال المسودة\n\ndocs/                             # التوثيق التقني (جذر المستودع)\n├── anti-detection-research.md\n└── more_powerful_search_skill/\n\nscripts/                          # النصوص البرمجية المساعدة (جذر المستودع)\n└── generate-patch.sh\n\ntests/                            # حالات الاختبار والنصوص البرمجية (جذر المستودع)\n├── README.md\n└── run-managed-tests.mjs\n```\n\n</div>\n\n## WiseFlow Pro متوفر الآن!\n\nقدرات استخراج أقوى، دعم أشمل لوسائل التواصل الاجتماعي، مع واجهة مستخدم وحزمة تثبيت بنقرة واحدة — لا حاجة للنشر!\n\nhttps://github.com/user-attachments/assets/57f8569c-e20a-4564-a669-1200d56c5725\n\n🔥 **النسخة الاحترافية معروضة للبيع الآن**: https://shouxiqingbaoguan.com/\n\n🌹 بدءاً من اليوم، ساهم بطلبات السحب (PR) في النسخة مفتوحة المصدر من wiseflow (الكود والتوثيق ومشاركة قصص النجاح مرحب بها). عند القبول، سيحصل المساهمون على ترخيص لمدة عام واحد لـ wiseflow Pro!\n\n## 🛡️ الترخيص\n\nمنذ الإصدار 4.2، قمنا بتحديث ترخيصنا مفتوح المصدر. 
يرجى الاطلاع على: [LICENSE](LICENSE)\n\nللتعاون التجاري، يرجى التواصل عبر **البريد الإلكتروني: zm.zhao@foxmail.com**\n\n## 📬 اتصل بنا\n\nلأي أسئلة أو اقتراحات، لا تتردد في ترك رسالة عبر [المشكلات](https://github.com/TeamWiseFlow/wiseflow/issues).\n\n🎉 يقدم wiseflow & OFB الآن **قاعدة معرفة مدفوعة**، تتضمن دروس تعليمية للتثبيت خطوة بخطوة، ونصائح تطبيقية حصرية، و**مجموعة WeChat VIP**:\n\nأضف \"Keeper\" على WeChat Enterprise للاستفسار:\n\n<img width=\"360\" height=\"360\" alt=\"wiseflow掌柜\" src=\"https://github.com/user-attachments/assets/b013b3fd-546e-4176-b418-57bee419e761\" />\n\n🌹 المصدر المفتوح يتطلب جهداً كبيراً — شكراً لدعمكم!\n\n## 🤝 wiseflow 5.x مبني على المشاريع مفتوحة المصدر الممتازة التالية:\n\n- Patchright (نسخة Python غير قابلة للكشف من مكتبة Playwright للاختبار والأتمتة) https://github.com/Kaliiiiiiiiii-Vinyzu/patchright-python\n- Feedparser (تحليل الخلاصات في Python) https://github.com/kurtmckee/feedparser\n- SearXNG (محرك بحث وصفي مجاني على الإنترنت يجمع النتائج من خدمات البحث وقواعد البيانات المختلفة) https://github.com/searxng/searxng\n- Wenyan (أداة تنسيق ونشر Markdown متعددة المنصات، يستخدمها New Media Editor Crew عبر مهارة wenyan-formatter) https://github.com/caol64/wenyan\n\n## الاستشهاد\n\nإذا أشرت إلى أو استشهدت بجزء أو كل هذا المشروع في عملك، يرجى تضمين المعلومات التالية:\n\n```\nAuthor: Wiseflow Team\nhttps://github.com/TeamWiseFlow/wiseflow\n```\n\n## الشركاء\n\n[<img src=\"https://github.com/TeamWiseFlow/wiseflow/raw/4.x/docs/logos/SiliconFlow.png\" alt=\"siliconflow\" width=\"360\">](https://siliconflow.com/)\n\n</div>\n"
  },
  {
    "path": "README_DE.md",
    "content": "# Wiseflow\n\n**[中文](README.md) | [English](README_EN.md) | [日本語](README_JP.md) | [한국어](README_KR.md) | [Français](README_FR.md) | [العربية](README_AR.md)**\n\n🚀 **STEP INTO 5.x**\n\n> 📌 **Suchen Sie 4.x?** Der ursprüngliche Code von v4.30 und früheren Versionen ist im [`4.x`-Branch](https://github.com/TeamWiseFlow/wiseflow/tree/4.x) verfügbar.\n\n```\n„Mein Leben hat Grenzen, doch das Wissen hat keine. Mit dem Begrenzten dem Grenzenlosen zu folgen — das ist gefährlich!\" — Zhuangzi, Innere Kapitel, Die Pflege des Lebensprinzips\n```\n\nWiseflow 4.x (einschließlich früherer Versionen) erreichte durch eine Reihe präziser Workflows leistungsstarke Datenerfassungsfähigkeiten in bestimmten Szenarien, hatte jedoch weiterhin erhebliche Einschränkungen:\n\n- 1. Interaktive Inhalte konnten nicht erfasst werden (Inhalte, die erst nach einem Klick erscheinen, insbesondere bei dynamischem Laden)\n- 2. Beschränkung auf Informationsfilterung und -extraktion, praktisch keine Fähigkeit zur Verarbeitung nachgelagerter Aufgaben\n- ……\n\nObwohl wir stets daran gearbeitet haben, die Funktionalität zu verbessern und die Grenzen zu erweitern, ist die reale Welt komplex — und das Internet ebenso. Regeln können niemals vollständig sein, daher kann ein fester Workflow niemals alle Szenarien abdecken. Dies ist kein Problem von wiseflow — es ist ein Problem traditioneller Software!\n\nDie rasante Entwicklung von Agenten im vergangenen Jahr hat uns jedoch die technische Möglichkeit gezeigt, menschliches Internetverhalten durch große Sprachmodelle vollständig zu simulieren. 
Das Erscheinen von [openclaw](https://github.com/openclaw/openclaw) hat diese Überzeugung weiter gestärkt.\n\nNoch bemerkenswerter ist, dass wir durch frühe Experimente und Erforschung entdeckt haben, dass die Integration der Erfassungsfähigkeiten von wiseflow als „Plugins\" in openclaw die beiden oben genannten Einschränkungen perfekt löst.\n\nhttps://github.com/user-attachments/assets/8d097b3b-f9ab-42eb-98bb-88af5d28b089\n\nEs ist jedoch zu beachten, dass das Plugin-System von openclaw sich erheblich von dem unterscheidet, was wir traditionell unter „Plugins\" verstehen (ähnlich den Plugins von Claude Code). Daher mussten wir das Konzept des „Add-ons\" einführen. Genau genommen wird wiseflow 5.x als openclaw Add-on erscheinen. Das originale openclaw verfügt nicht über eine „Add-on\"-Architektur, aber in der Praxis benötigen Sie nur wenige einfache Shell-Befehle, um diese „Umgestaltung\" durchzuführen. Wir haben auch eine sofort einsatzbereite, erweiterte Version von openclaw mit voreingestellten Konfigurationen für reale Geschäftsszenarien vorbereitet: [openclaw_for_business](https://github.com/TeamWiseFlow/openclaw_for_business). Sie können es einfach klonen und das wiseflow-Release in den Add-on-Ordner von openclaw_for_business entpacken.\n\n## ✨ Was erhalten Sie durch die Installation von wiseflow (überlegen dem originalen openclaw)?\n\n### 1. Anti-Erkennungs-Browser, keine Browser-Erweiterungen erforderlich\n\nwiseflow's patch-001 ersetzt das in openclaw integrierte Playwright durch [Patchright](https://github.com/Kaliiiiiiiiii-Vinyzu/patchright) (ein unerkannter Fork von Playwright) und reduziert damit erheblich die Wahrscheinlichkeit, dass automatisierte Browser von Ziel-Websites erkannt und blockiert werden. 
Dadurch lassen sich ohne die Installation einer Chrome-Relay-Extension mit einem verwalteten Browser gleichwertige oder sogar überlegene Web-Erfassungs- und Bedienungsfähigkeiten gegenüber einer Relay-Konfiguration erzielen.\n\n📥 *Wir haben alle derzeit populären Browser-Automatisierungs-Frameworks bewertet, darunter nodriver, browser-use und Vercels agent-browser. Wir können bestätigen, dass zwar alle über CDP arbeiten und beständige openclaw-spezifische Profile bereitstellen, aber nur Patchright eine vollständige Entfernung von CDP-Fingerprints bietet. Mit anderen Worten: Selbst der direkteste CDP-Verbindungsansatz hinterlässt nachweisbare Merkmale. Andere Frameworks sind für automatisierte Tests konzipiert, nicht für Datenerfassung, während Patchright speziell für die Erfassung entwickelt wurde. Da es sich im Wesentlichen um einen Patch auf Playwright handelt, erbt es fast alle High-Level-APIs von Playwright — und ist dadurch nativ mit openclaw kompatibel, ohne dass zusätzliche Erweiterungen oder MCP installiert werden müssen.*\n\n### 2. Automatischer Tab-Wiederherstellungsmechanismus\n\nWenn ein Ziel-Browser-Tab während eines Agent-Vorgangs unerwartet geschlossen oder verloren geht, führt das System automatisch eine snapshot-basierte Tab-Wiederherstellung durch, damit Aufgaben nicht durch Tab-Verlust unterbrochen werden.\n\n### 3. Smart Search Skill\n\nErsetzt die eingebaute `web_search` von openclaw durch leistungsfähigere Suchfunktionen. 
Im Vergleich zum ursprünglich integrierten web search tool bietet Smart Search drei zentrale Vorteile:\n\n- **Völlig kostenlos, kein API-Schlüssel erforderlich**: Keine Abhängigkeit von Drittanbieter-Such-APIs — null Kosten\n- **Echtzeit-Suche für maximale Aktualität**: Steuert den Browser direkt zu Zielseiten oder großen Social-Media-Plattformen (Weibo, Twitter/X, Facebook usw.), um die zuletzt veröffentlichten Inhalte sofort abzurufen\n- **Benutzerdefinierbare Suchquellen**: Benutzer können ihre Suchquellen frei festlegen, um präzise und zielgerichtete Informationsabfragen zu ermöglichen\n\n### 4. New-Media-Editor Crew (vorkonfigurierter KI-Agent)\n\nEin sofort einsatzbereiter KI-Agent zur Erstellung chinesischer Social-Media-Inhalte, spezialisiert auf die wichtigsten chinesischen Plattformen wie Weibo, Xiaohongshu, Zhihu, Bilibili und Douyin.\n\n**Hauptfähigkeiten:**\n\n- Themenrecherche + Trendanalyse (Modus A)\n- Entwurfserweiterung + Online-Belegung (Modus B)\n- Nach der Fertigstellung automatischer Aufruf von [Wenyan](https://github.com/caol64/wenyan) zur Darstellung als WeChat-Public-Account-HTML mit 7 integrierten Themen\n- Direktes Pushen in den WeChat-Public-Account-Entwurfsbereich (Modus C, erfordert `WECHAT_APP_ID`/`WECHAT_APP_SECRET`)\n- KI-Bild-/Videogenerierung ([SiliconFlow](https://www.siliconflow.com/) Bild/Video-Generierung, erfordert `SILICONFLOW_API_KEY`)\n\n## 🌟 Schnellstart\n\n> **💡 Hinweis zu API-Kosten**\n>\n> wiseflow 5.x basiert auf dem Agent-Workflow von openclaw und benötigt LLM-API-Zugang. 
Wir empfehlen, Ihre API-Zugangsdaten vorab vorzubereiten:\n>\n> - **Internationale Benutzer (empfohlen)**: [SiliconFlow](https://www.siliconflow.com/) — nach der Registrierung werden kostenlose Credits gutgeschrieben, die die Anfangskosten abdecken\n> - **OpenAI / Anthropic und andere Anbieter**: Jede kompatible API ist verwendbar\n\nLaden Sie das integrierte Paket (enthält openclaw_for_business und das wiseflow Addon) direkt aus den [Releases](https://github.com/TeamWiseFlow/wiseflow/releases) dieses Repositories herunter.\n\n1. Das Archiv herunterladen und entpacken\n2. In den entpackten Ordner wechseln\n3. Startmodus auswählen:\n\n   **Debug-Modus** (Einzelstart, für Tests und Entwicklung):\n   ```bash\n   ./scripts/dev.sh gateway\n   ```\n\n   **Produktionsmodus** (als Systemdienst installieren, für den Dauerbetrieb):\n   ```bash\n   ./scripts/reinstall-daemon.sh\n   ```\n\n> **Systemanforderungen**\n> - **Ubuntu 22.04** wird empfohlen\n> - **Windows WSL2**-Umgebung wird unterstützt\n> - **macOS** wird unterstützt\n> - Die direkte Ausführung unter **nativem Windows** wird **nicht unterstützt**\n\n### [Alternative] Manuelle Installation\n\n> Hinweis: Sie müssen zuerst openclaw_for_business herunterladen und deployen. 
Download-Adresse: https://github.com/TeamWiseFlow/openclaw_for_business/releases\n\nKopieren Sie den `wiseflow`-Ordner aus diesem Repository (nicht das Repository selbst) in das `addons/`-Verzeichnis von openclaw_for_business:\n\n```bash\n# Option 1: Aus dem wiseflow-Repository klonen\ngit clone https://github.com/TeamWiseFlow/wiseflow.git /tmp/wiseflow\ncp -r /tmp/wiseflow/wiseflow <openclaw_for_business>/addons/wiseflow\n```\n\nNach der Installation openclaw_for_business neu starten, um die Änderungen zu aktivieren.\n\n## Verzeichnisstruktur\n\n```\nwiseflow/                         # addon-Paket (in addons/-Verzeichnis kopieren)\n├── addon.json                    # Metadaten\n├── overrides.sh                  # pnpm overrides + integrierte web_search deaktivieren\n├── patches/\n│   ├── 001-browser-tab-recovery.patch        # Tab-Wiederherstellungs-Patch\n│   ├── 002-disable-web-search-env-var.patch  # Integrierte web_search deaktivieren (env var)\n│   └── 003-act-field-validation.patch        # ACT-Feldvalidierungs-Patch\n├── skills/                       # Globale Skills (für alle Agents verfügbar)\n│   ├── browser-guide/SKILL.md    # Best Practices für den Browser (Login/CAPTCHA/Lazy-Loading etc.)\n│   ├── smart-search/SKILL.md     # Multiplattform-Such-URL-Builder (ersetzt integrierte web_search)\n│   └── rss-reader/               # RSS/Atom Feed-Reader\n│       ├── SKILL.md\n│       ├── package.json\n│       └── scripts/fetch-rss.mjs\n└── crew/                         # Vorkonfigurierte KI-Agents (Crew-Vorlagen)\n    └── new-media-editor/         # New-Media-Editor (Chinesische Social-Media-Inhaltserstellung)\n        ├── IDENTITY.md / SOUL.md / AGENTS.md / TOOLS.md / ...\n        └── skills/               # Crew-spezifische Skills\n            ├── siliconflow-img-gen/   # KI-Bildgenerierung (SiliconFlow API)\n            ├── siliconflow-video-gen/ # KI-Videogenerierung (SiliconFlow API)\n            └── wenyan-formatter/      # Markdown → WeChat HTML / 
Entwurf pushen\n\ndocs/                             # Technische Dokumentation (Repository-Root)\n├── anti-detection-research.md\n└── more_powerful_search_skill/\n\nscripts/                          # Hilfsskripte (Repository-Root)\n└── generate-patch.sh\n\ntests/                            # Testfälle und Skripte (Repository-Root)\n├── README.md\n└── run-managed-tests.mjs\n```\n\n## WiseFlow Pro ist jetzt verfügbar!\n\nStärkere Scraping-Fähigkeiten, umfassendere Social-Media-Unterstützung, mit UI-Oberfläche und Ein-Klick-Installationspaket — keine Bereitstellung erforderlich!\n\nhttps://github.com/user-attachments/assets/57f8569c-e20a-4564-a669-1200d56c5725\n\n🔥 **Pro-Version ist jetzt im Verkauf**: https://shouxiqingbaoguan.com/\n\n🌹 Ab sofort: Beiträge (PRs) zur Open-Source-Version von wiseflow (Code, Dokumentation und erfolgreiche Fallstudien sind willkommen) — bei Annahme erhalten Mitwirkende eine einjährige Lizenz für wiseflow Pro!\n\n## 🛡️ Lizenz\n\nSeit Version 4.2 haben wir unsere Open-Source-Lizenz aktualisiert. 
Bitte beachten Sie: [LICENSE](LICENSE)\n\nFür kommerzielle Zusammenarbeit kontaktieren Sie bitte **Email: zm.zhao@foxmail.com**\n\n## 📬 Kontakt\n\nBei Fragen oder Vorschlägen hinterlassen Sie gerne eine Nachricht über [Issues](https://github.com/TeamWiseFlow/wiseflow/issues).\n\n🎉 wiseflow & OFB bieten jetzt eine **kostenpflichtige Wissensdatenbank** an, einschließlich Schritt-für-Schritt-Installationstutorials, exklusiver Anwendungstipps und einer **VIP-WeChat-Gruppe**:\n\nFügen Sie „Keeper\" auf WeChat Enterprise für Anfragen hinzu:\n\n<img width=\"360\" height=\"360\" alt=\"wiseflow掌柜\" src=\"https://github.com/user-attachments/assets/b013b3fd-546e-4176-b418-57bee419e761\" />\n\n🌹 Open Source erfordert viel Aufwand — vielen Dank für Ihre Unterstützung!\n\n## 🤝 wiseflow 5.x basiert auf folgenden hervorragenden Open-Source-Projekten:\n\n- Patchright (Unerkannte Python-Version der Playwright Test- und Automatisierungsbibliothek) https://github.com/Kaliiiiiiiiii-Vinyzu/patchright-python\n- Feedparser (Feeds in Python parsen) https://github.com/kurtmckee/feedparser\n- SearXNG (eine freie Internet-Metasuchmaschine, die Ergebnisse verschiedener Suchdienste und Datenbanken aggregiert) https://github.com/searxng/searxng\n- Wenyan (plattformübergreifendes Markdown-Formatierungs- und Veröffentlichungstool, vom New-Media-Editor-Crew über das wenyan-formatter-Skill verwendet) https://github.com/caol64/wenyan\n\n## Citation\n\nWenn Sie Teile oder das gesamte Projekt in Ihrer Arbeit referenzieren oder zitieren, geben Sie bitte folgende Informationen an:\n\n```\nAuthor: Wiseflow Team\nhttps://github.com/TeamWiseFlow/wiseflow\n```\n\n## Partner\n\n[<img src=\"https://github.com/TeamWiseFlow/wiseflow/raw/4.x/docs/logos/SiliconFlow.png\" alt=\"siliconflow\" width=\"360\">](https://siliconflow.com/)\n"
  },
  {
    "path": "README_EN.md",
    "content": "# Wiseflow\n\n**[中文](README.md) | [日本語](README_JP.md) | [한국어](README_KR.md) | [Deutsch](README_DE.md) | [Français](README_FR.md) | [العربية](README_AR.md)**\n\n🚀 **STEP INTO 5.x**\n\n> 📌 **Looking for 4.x?** The original v4.30 and earlier code is available on the [`4.x` branch](https://github.com/TeamWiseFlow/wiseflow/tree/4.x).\n\n```\n\"My life has a limit, but knowledge has none. To pursue the limitless with the limited — that is perilous!\" — Zhuangzi, Inner Chapters, Nourishing the Lord of Life\n```\n\nWiseflow 4.x (and earlier versions) achieved powerful data acquisition capabilities in specific scenarios through a series of precisely engineered workflows, but still had significant limitations:\n\n- 1. Unable to acquire interactive content (content that only appears after clicking, especially in dynamically loaded scenarios)\n- 2. Limited to information filtering and extraction, with virtually no downstream task capabilities\n- ……\n\nAlthough we have been dedicated to improving its functionality and expanding its boundaries, the real world is complex, and so is the real internet. Rules can never be exhaustive, so a fixed workflow can never adapt to all scenarios. This is not a problem with wiseflow — it's a problem with traditional software!\n\nHowever, the rapid advancement of Agents over the past year has shown us the technical possibility of fully simulating human internet behavior driven by large language models. 
The emergence of [openclaw](https://github.com/openclaw/openclaw) has further strengthened this belief.\n\nWhat's even more remarkable is that through our early experiments and exploration, we discovered that integrating wiseflow's acquisition capabilities into openclaw as \"plugins\" perfectly solves the two limitations mentioned above.\n\nhttps://github.com/user-attachments/assets/8d097b3b-f9ab-42eb-98bb-88af5d28b089\n\nIt should be noted that openclaw's plugin system is quite different from what we traditionally understand as \"plugins\" (similar to Claude Code's plugins). Therefore, we had to introduce the concept of \"add-on\". To be precise, wiseflow 5.x will appear in the form of an openclaw add-on. The original openclaw does not have an \"add-on\" architecture, but in practice, you only need a few simple shell commands to complete this \"transformation\". We have also prepared a ready-to-use enhanced version of openclaw with a series of preset configurations for real business scenarios: [openclaw_for_business](https://github.com/TeamWiseFlow/openclaw_for_business). You can simply clone it and extract the wiseflow release into the add-on folder of openclaw_for_business.\n\n## ✨ What Do You Gain by Installing wiseflow (Superior to Vanilla openclaw)?\n\n### 1. Anti-Detection Browser, No Browser Extensions Required\n\nwiseflow's patch-001 replaces openclaw's built-in Playwright with [Patchright](https://github.com/Kaliiiiiiiiii-Vinyzu/patchright) (an undetected fork of Playwright), significantly reducing the likelihood of automated browsers being identified and blocked by target websites. This means that without installing the Chrome relay extension, a managed browser alone can achieve the same — or even better — web acquisition and operation capabilities as a relay setup.\n\n📥 *We evaluated all major browser automation frameworks currently available, including nodriver, browser-use, and Vercel's agent-browser. 
We can confirm that while they all operate through CDP and provide persistent openclaw-specific profiles, only Patchright delivers complete removal of CDP fingerprints. In other words, even the most direct CDP connection approach still carries detectable signatures. Other frameworks are designed for automated testing, not for data acquisition, whereas Patchright was specifically built for acquisition. Since it is essentially a patch on top of Playwright, it inherits nearly all of Playwright's high-level APIs — making it natively compatible with openclaw without requiring any additional extensions or MCP.*\n\n### 2. Automatic Tab Recovery Mechanism\n\nWhen a target browser tab is unexpectedly closed or lost during an Agent operation, the system automatically performs snapshot-based tab recovery, ensuring tasks are not interrupted by tab loss.\n\n### 3. Smart Search Skill\n\nReplaces openclaw's built-in `web_search` with more powerful search capabilities. Compared to the original built-in web search tool, Smart Search has three core advantages:\n\n- **Completely free, no API key required**: Does not rely on any third-party search APIs — zero cost\n- **Real-time search for maximum timeliness**: Directly drives the browser to target pages or major social media platforms (Weibo, Twitter/X, Facebook, etc.) to search for the latest published content\n- **User-configurable search sources**: Users can freely specify their search sources for precise, targeted information retrieval\n\n### 4. 
New Media Editor Crew (Preset AI Agent)\n\nA ready-to-use Chinese social media content creation AI Agent, focused on major Chinese platforms including Weibo, Xiaohongshu, Zhihu, Bilibili, and Douyin.\n\n**Key capabilities:**\n\n- Topic research + trending analysis (Mode A)\n- Draft expansion + online fact support (Mode B)\n- After article finalization, automatically invokes [Wenyan](https://github.com/caol64/wenyan) to render WeChat public account-style HTML with 7 built-in themes\n- Direct push to WeChat public account draft box (Mode C, requires `WECHAT_APP_ID`/`WECHAT_APP_SECRET`)\n- AI image/video generation support ([SiliconFlow](https://www.siliconflow.com/) image/video generation, requires `SILICONFLOW_API_KEY`)\n\n## 🌟 Quick Start\n\n> **💡 API Cost Note**\n>\n> wiseflow 5.x is powered by openclaw's Agent workflow, which requires LLM API access. We recommend preparing your API credentials first:\n>\n> - **International users (recommended)**: [SiliconFlow](https://www.siliconflow.com/) — free credits available after registration, covering initial usage costs\n> - **OpenAI / Anthropic and other providers**: Any compatible API works\n\nDownload the integrated package (which includes openclaw_for_business and the wiseflow addon) directly from this repository's [Releases](https://github.com/TeamWiseFlow/wiseflow/releases).\n\n1. Download and extract the archive\n2. Enter the extracted directory\n3. 
Choose your startup mode:\n\n   **Debug mode** (single startup, for testing and development):\n   ```bash\n   ./scripts/dev.sh gateway\n   ```\n\n   **Production mode** (install as a system service for long-term operation):\n   ```bash\n   ./scripts/reinstall-daemon.sh\n   ```\n\n> **System Requirements**\n> - **Ubuntu 22.04** is recommended\n> - **Windows WSL2** environment is supported\n> - **macOS** is supported\n> - Running directly on **native Windows** is **not supported**\n\n### [Alternative] Manual Installation\n\n> Note: You need to first download and deploy openclaw_for_business from: https://github.com/TeamWiseFlow/openclaw_for_business/releases\n\nCopy the `wiseflow` folder from this repository (not the repository itself) to the `addons/` directory of openclaw_for_business:\n\n```bash\n# Option 1: Clone from the wiseflow repository\ngit clone https://github.com/TeamWiseFlow/wiseflow.git /tmp/wiseflow\ncp -r /tmp/wiseflow/wiseflow <openclaw_for_business>/addons/wiseflow\n```\n\nRestart openclaw_for_business after installation to take effect.\n\n## Directory Structure\n\n```\nwiseflow/                         # addon package (copy to addons/ directory)\n├── addon.json                    # Metadata\n├── overrides.sh                  # pnpm overrides + disable built-in web_search\n├── patches/\n│   ├── 001-browser-tab-recovery.patch        # Tab recovery patch\n│   ├── 002-disable-web-search-env-var.patch  # Disable built-in web_search (env var)\n│   └── 003-act-field-validation.patch        # ACT field validation patch\n├── skills/                       # Global skills (available to all Agents)\n│   ├── browser-guide/SKILL.md    # Browser best practices (login/captcha/lazy-loading, etc.)\n│   ├── smart-search/SKILL.md     # Multi-platform search URL builder (replaces built-in web_search)\n│   └── rss-reader/               # RSS/Atom Feed reader\n│       ├── SKILL.md\n│       ├── package.json\n│       └── scripts/fetch-rss.mjs\n└── crew/                     
    # Preset AI Agents (Crew templates)\n    └── new-media-editor/         # New Media Editor (Chinese social media content creation)\n        ├── IDENTITY.md / SOUL.md / AGENTS.md / TOOLS.md / ...\n        └── skills/               # Crew-specific skills\n            ├── siliconflow-img-gen/   # AI image generation (SiliconFlow API)\n            ├── siliconflow-video-gen/ # AI video generation (SiliconFlow API)\n            └── wenyan-formatter/      # Markdown → WeChat HTML / push draft\n\ndocs/                             # Technical documentation (repo root)\n├── anti-detection-research.md\n└── more_powerful_search_skill/\n\nscripts/                          # Utility scripts (repo root)\n└── generate-patch.sh\n\ntests/                            # Test cases and scripts (repo root)\n├── README.md\n└── run-managed-tests.mjs\n```\n\n## WiseFlow Pro is Now Available!\n\nStronger scraping capabilities, more comprehensive social media support, with UI interface and one-click installation package — no deployment needed!\n\nhttps://github.com/user-attachments/assets/57f8569c-e20a-4564-a669-1200d56c5725\n\n🔥 **Pro version is now on sale**: https://shouxiqingbaoguan.com/\n\n🌹 Starting today, contribute PRs to the wiseflow open-source version (code, documentation, and successful case studies are all welcome). Once accepted, contributors will receive a one-year license for wiseflow Pro!\n\n## 🛡️ License\n\nSince version 4.2, we have updated our open-source license. 
Please refer to: [LICENSE](LICENSE)\n\nFor commercial cooperation, please contact **Email: zm.zhao@foxmail.com**\n\n## 📬 Contact\n\nFor any questions or suggestions, feel free to leave a message via [issue](https://github.com/TeamWiseFlow/wiseflow/issues).\n\n🎉 wiseflow & OFB now offer a **paid knowledge base**, including step-by-step installation tutorials, exclusive application tips, and a **VIP WeChat group**:\n\nFeel free to add \"Keeper\" on WeChat Enterprise for inquiries:\n\n<img width=\"360\" height=\"360\" alt=\"wiseflow掌柜\" src=\"https://github.com/user-attachments/assets/b013b3fd-546e-4176-b418-57bee419e761\" />\n\n🌹 Open source takes effort — thank you for your support!\n\n## 🤝 wiseflow 5.x is built on the following excellent open-source projects:\n\n- Patchright (Undetected Python version of the Playwright testing and automation library) https://github.com/Kaliiiiiiiiii-Vinyzu/patchright-python\n- Feedparser (Parse feeds in Python) https://github.com/kurtmckee/feedparser\n- SearXNG (a free internet metasearch engine which aggregates results from various search services and databases) https://github.com/searxng/searxng\n- Wenyan (multi-platform Markdown formatting and publishing tool, used by the New Media Editor Crew via the wenyan-formatter skill) https://github.com/caol64/wenyan\n\n## Citation\n\nIf you reference or cite part or all of this project in your work, please include the following information:\n\n```\nAuthor: Wiseflow Team\nhttps://github.com/TeamWiseFlow/wiseflow\n```\n\n## Partners\n\n[<img src=\"https://github.com/TeamWiseFlow/wiseflow/raw/4.x/docs/logos/SiliconFlow.png\" alt=\"siliconflow\" width=\"360\">](https://siliconflow.com/)\n"
  },
  {
    "path": "README_FR.md",
    "content": "# Wiseflow\n\n**[中文](README.md) | [English](README_EN.md) | [日本語](README_JP.md) | [한국어](README_KR.md) | [Deutsch](README_DE.md) | [العربية](README_AR.md)**\n\n🚀 **STEP INTO 5.x**\n\n> 📌 **Vous cherchez la version 4.x ?** Le code original de la v4.30 et des versions antérieures est disponible sur la [branche `4.x`](https://github.com/TeamWiseFlow/wiseflow/tree/4.x).\n\n```\n« Ma vie a des limites, mais la connaissance n'en a point. Poursuivre l'illimité avec le limité — voilà qui est périlleux ! » — Zhuangzi, Chapitres intérieurs, Nourrir le principe vital\n```\n\nWiseflow 4.x (y compris les versions précédentes) a permis d'atteindre de puissantes capacités d'acquisition de données dans des scénarios spécifiques grâce à une série de workflows précis, mais présentait encore des limitations significatives :\n\n- 1. Incapacité à acquérir du contenu interactif (contenu qui n'apparaît qu'après un clic, en particulier dans les cas de chargement dynamique)\n- 2. Limité au filtrage et à l'extraction d'informations, avec pratiquement aucune capacité de traitement en aval\n- ……\n\nBien que nous nous soyons constamment efforcés d'améliorer ses fonctionnalités et d'étendre ses limites, le monde réel est complexe, tout comme l'internet. Les règles ne peuvent jamais être exhaustives, c'est pourquoi un workflow fixe ne peut jamais s'adapter à tous les scénarios. Ce n'est pas un problème de wiseflow — c'est un problème des logiciels traditionnels !\n\nCependant, les progrès fulgurants des Agents au cours de l'année écoulée nous ont montré la possibilité technique de simuler entièrement le comportement humain sur Internet grâce aux grands modèles de langage. 
L'apparition d'[openclaw](https://github.com/openclaw/openclaw) a renforcé davantage cette conviction.\n\nPlus remarquable encore, grâce à nos expériences et explorations préliminaires, nous avons découvert que l'intégration des capacités d'acquisition de wiseflow dans openclaw sous forme de « plugins » résout parfaitement les deux limitations mentionnées ci-dessus.\n\nhttps://github.com/user-attachments/assets/8d097b3b-f9ab-42eb-98bb-88af5d28b089\n\nIl convient de noter que le système de plugins d'openclaw diffère considérablement de ce que nous comprenons traditionnellement par « plugins » (similaires aux plugins de Claude Code). Nous avons donc dû introduire le concept d'« add-on ». Pour être précis, wiseflow 5.x apparaîtra sous la forme d'un add-on openclaw. L'openclaw original ne dispose pas d'une architecture « add-on », mais en pratique, vous n'avez besoin que de quelques commandes shell simples pour effectuer cette « transformation ». Nous avons également préparé une version améliorée d'openclaw prête à l'emploi avec des configurations prédéfinies pour des scénarios commerciaux réels : [openclaw_for_business](https://github.com/TeamWiseFlow/openclaw_for_business). Vous pouvez simplement le cloner et extraire la release wiseflow dans le dossier add-on d'openclaw_for_business.\n\n## ✨ Que gagnez-vous en installant wiseflow (supérieur à l'openclaw original) ?\n\n### 1. Navigateur anti-détection, sans extensions de navigateur\n\nLe patch-001 de wiseflow remplace le Playwright intégré d'openclaw par [Patchright](https://github.com/Kaliiiiiiiiii-Vinyzu/patchright) (un fork non détectable de Playwright), réduisant considérablement le risque que les navigateurs automatisés soient identifiés et bloqués par les sites cibles. 
Cela permet d'atteindre des capacités d'acquisition et d'opération web équivalentes, voire supérieures à celles d'une configuration relay, en utilisant uniquement un navigateur géré sans installer d'extension Chrome relay.\n\n📥 *Nous avons évalué tous les principaux frameworks d'automatisation de navigateur disponibles, notamment nodriver, browser-use et agent-browser de Vercel. Nous pouvons confirmer que bien qu'ils fonctionnent tous via CDP et fournissent des profils persistants dédiés à openclaw, seul Patchright assure la suppression complète des empreintes CDP. En d'autres termes, même l'approche de connexion CDP la plus directe laisse des signatures détectables. Les autres frameworks sont conçus pour les tests automatisés, non pour l'acquisition de données, tandis que Patchright a été spécifiquement conçu pour l'acquisition. Étant essentiellement un patch de Playwright, il hérite de presque toutes ses API de haut niveau — le rendant nativement compatible avec openclaw sans nécessiter d'extensions ou de MCP supplémentaires.*\n\n### 2. Mécanisme de récupération automatique des onglets\n\nLorsqu'un onglet cible est fermé ou perdu de manière inattendue lors d'une opération Agent, le système effectue automatiquement une récupération d'onglet basée sur des snapshots, garantissant que les tâches ne soient pas interrompues par une perte d'onglet.\n\n### 3. Smart Search Skill\n\nRemplace le `web_search` intégré d'openclaw par des capacités de recherche plus puissantes. Comparé à l'outil web search intégré d'origine, Smart Search présente trois avantages clés :\n\n- **Entièrement gratuit, sans clé API** : Ne dépend d'aucune API de recherche tierce — coût zéro\n- **Recherche en temps réel pour une actualité maximale** : Pilote directement le navigateur vers les pages cibles ou les grandes plateformes de médias sociaux (Weibo, Twitter/X, Facebook, etc.) 
pour récupérer immédiatement les contenus publiés récemment\n- **Sources de recherche personnalisables** : Les utilisateurs peuvent librement spécifier leurs sources de recherche pour une récupération d'informations précise et ciblée\n\n### 4. New Media Editor Crew (Agent IA préconfiguré)\n\nUn agent IA de création de contenu pour les réseaux sociaux chinois prêt à l'emploi, spécialisé dans les principales plateformes chinoises comme Weibo, Xiaohongshu, Zhihu, Bilibili et Douyin.\n\n**Capacités principales :**\n\n- Recherche de sujets + analyse des tendances (Mode A)\n- Expansion du brouillon + justification en ligne (Mode B)\n- Après finalisation de l'article, appel automatique de [Wenyan](https://github.com/caol64/wenyan) pour le rendre en HTML style compte officiel WeChat, avec 7 thèmes intégrés\n- Envoi direct vers la boîte de brouillons du compte officiel WeChat (Mode C, nécessite `WECHAT_APP_ID`/`WECHAT_APP_SECRET`)\n- Support de génération d'images/vidéos IA ([SiliconFlow](https://www.siliconflow.com/) génération d'images/vidéos, nécessite `SILICONFLOW_API_KEY`)\n\n## 🌟 Démarrage rapide\n\n> **💡 Note sur les coûts API**\n>\n> wiseflow 5.x repose sur le workflow Agent d'openclaw, qui nécessite un accès à l'API LLM. Nous recommandons de préparer vos identifiants API à l'avance :\n>\n> - **Utilisateurs internationaux (recommandé)** : [SiliconFlow](https://www.siliconflow.com/) — des crédits gratuits sont disponibles après inscription, couvrant les coûts initiaux\n> - **OpenAI / Anthropic et autres fournisseurs** : Toute API compatible fonctionne\n\nTéléchargez le package intégré (qui inclut openclaw_for_business et le wiseflow addon) directement depuis les [Releases](https://github.com/TeamWiseFlow/wiseflow/releases) de ce dépôt.\n\n1. Télécharger et extraire l'archive\n2. Accéder au dossier extrait\n3. 
Choisir le mode de démarrage :\n\n   **Mode débogage** (démarrage unique, pour les tests et le développement) :\n   ```bash\n   ./scripts/dev.sh gateway\n   ```\n\n   **Mode production** (installation en tant que service système, pour un fonctionnement à long terme) :\n   ```bash\n   ./scripts/reinstall-daemon.sh\n   ```\n\n> **Configuration requise**\n> - **Ubuntu 22.04** est recommandé\n> - L'environnement **Windows WSL2** est pris en charge\n> - **macOS** est pris en charge\n> - L'exécution directe sur **Windows natif** n'est **pas prise en charge**\n\n### [Alternative] Installation manuelle\n\n> Note : Vous devez d'abord télécharger et déployer openclaw_for_business depuis : https://github.com/TeamWiseFlow/openclaw_for_business/releases\n\nCopiez le dossier `wiseflow` de ce dépôt (pas le dépôt lui-même) dans le répertoire `addons/` d'openclaw_for_business :\n\n```bash\n# Option 1 : Cloner depuis le dépôt wiseflow\ngit clone https://github.com/TeamWiseFlow/wiseflow.git /tmp/wiseflow\ncp -r /tmp/wiseflow/wiseflow <openclaw_for_business>/addons/wiseflow\n```\n\nRedémarrez openclaw_for_business après l'installation pour que les changements prennent effet.\n\n## Structure des répertoires\n\n```\nwiseflow/                         # package addon (copier dans le répertoire addons/)\n├── addon.json                    # Métadonnées\n├── overrides.sh                  # pnpm overrides + désactiver web_search intégré\n├── patches/\n│   ├── 001-browser-tab-recovery.patch        # Patch de récupération d'onglets\n│   ├── 002-disable-web-search-env-var.patch  # Désactiver web_search intégré (env var)\n│   └── 003-act-field-validation.patch        # Patch de validation des champs ACT\n├── skills/                       # Skills globaux (disponibles pour tous les Agents)\n│   ├── browser-guide/SKILL.md    # Bonnes pratiques du navigateur (connexion/CAPTCHA/chargement différé, etc.)\n│   ├── smart-search/SKILL.md     # Constructeur d'URL de recherche multi-plateforme (remplace 
web_search intégré)\n│   └── rss-reader/               # Lecteur de flux RSS/Atom\n│       ├── SKILL.md\n│       ├── package.json\n│       └── scripts/fetch-rss.mjs\n└── crew/                         # Agents IA préconfigurés (modèles Crew)\n    └── new-media-editor/         # New Media Editor (création de contenu social media chinois)\n        ├── IDENTITY.md / SOUL.md / AGENTS.md / TOOLS.md / ...\n        └── skills/               # Skills spécifiques au Crew\n            ├── siliconflow-img-gen/   # Génération d'images IA (API SiliconFlow)\n            ├── siliconflow-video-gen/ # Génération de vidéos IA (API SiliconFlow)\n            └── wenyan-formatter/      # Markdown → HTML WeChat / envoi brouillon\n\ndocs/                             # Documentation technique (racine du dépôt)\n├── anti-detection-research.md\n└── more_powerful_search_skill/\n\nscripts/                          # Scripts utilitaires (racine du dépôt)\n└── generate-patch.sh\n\ntests/                            # Cas de test et scripts (racine du dépôt)\n├── README.md\n└── run-managed-tests.mjs\n```\n\n## WiseFlow Pro est maintenant disponible !\n\nDes capacités de scraping plus puissantes, un support plus complet des réseaux sociaux, avec interface graphique et package d'installation en un clic — aucun déploiement nécessaire !\n\nhttps://github.com/user-attachments/assets/57f8569c-e20a-4564-a669-1200d56c5725\n\n🔥 **La version Pro est en vente** : https://shouxiqingbaoguan.com/\n\n🌹 Dès aujourd'hui, contribuez des PRs à la version open source de wiseflow (code, documentation et partage de cas d'utilisation réussis sont les bienvenus). Une fois acceptées, les contributeurs recevront une licence d'un an pour wiseflow Pro !\n\n## 🛡️ Licence\n\nDepuis la version 4.2, nous avons mis à jour notre licence open source. 
Veuillez consulter : [LICENSE](LICENSE)\n\nPour une coopération commerciale, veuillez contacter **Email : zm.zhao@foxmail.com**\n\n## 📬 Contact\n\nPour toute question ou suggestion, n'hésitez pas à laisser un message via les [issues](https://github.com/TeamWiseFlow/wiseflow/issues).\n\n🎉 wiseflow & OFB proposent désormais une **base de connaissances payante**, incluant des tutoriels d'installation pas à pas, des astuces d'application exclusives et un **groupe WeChat VIP** :\n\nN'hésitez pas à ajouter « Keeper » sur WeChat Enterprise pour toute demande :\n\n<img width=\"360\" height=\"360\" alt=\"wiseflow掌柜\" src=\"https://github.com/user-attachments/assets/b013b3fd-546e-4176-b418-57bee419e761\" />\n\n🌹 L'open source demande beaucoup d'efforts — merci pour votre soutien !\n\n## 🤝 wiseflow 5.x est construit sur les excellents projets open source suivants :\n\n- Patchright (Version Python indétectable de la bibliothèque de test et d'automatisation Playwright) https://github.com/Kaliiiiiiiiii-Vinyzu/patchright-python\n- Feedparser (Analyse de flux en Python) https://github.com/kurtmckee/feedparser\n- SearXNG (un métamoteur de recherche internet gratuit qui agrège les résultats de divers services de recherche et bases de données) https://github.com/searxng/searxng\n- Wenyan (outil de formatage et de publication Markdown multi-plateforme, utilisé par le New Media Editor Crew via le skill wenyan-formatter) https://github.com/caol64/wenyan\n\n## Citation\n\nSi vous référencez ou citez tout ou partie de ce projet dans votre travail, veuillez inclure les informations suivantes :\n\n```\nAuthor : Wiseflow Team\nhttps://github.com/TeamWiseFlow/wiseflow\n```\n\n## Partenaires\n\n[<img src=\"https://github.com/TeamWiseFlow/wiseflow/raw/4.x/docs/logos/SiliconFlow.png\" alt=\"siliconflow\" width=\"360\">](https://siliconflow.com/)\n"
  },
  {
    "path": "README_JP.md",
    "content": "# Wiseflow\n\n**[中文](README.md) | [English](README_EN.md) | [한국어](README_KR.md) | [Deutsch](README_DE.md) | [Français](README_FR.md) | [العربية](README_AR.md)**\n\n🚀 **STEP INTO 5.x**\n\n> 📌 **4.x をお探しですか？** オリジナルの v4.30 以前のコードは [`4.x` ブランチ](https://github.com/TeamWiseFlow/wiseflow/tree/4.x)にありま��。\n\n```\n「我が生には涯（かぎり）有るも、知には涯無し。涯有るを以て涯無きに随（したが）うは、殆（あやう）きのみ！」—— 『荘子・内篇・養生主第三』\n```\n\nwiseflow 4.x（およびそれ以前のバージョン）は、一連の精密なワークフローによって特定のシナリオで強力なデータ取得能力を実現しましたが、依然として多くの制限がありました：\n\n- 1. インタラクティブなコンテンツを取得できない（クリックしないと表示されないコンテンツ、特に動的ロードの場合）\n- 2. 情報のフィルタリングと抽出のみで、下流タスク処理能力がほぼない\n- ……\n\n私たちは機能の改善と範囲の拡大に取り組んできましたが、現実の世界は複雑であり、インターネットも同様です。ルールを網羅することは不可能であるため、固定のワークフローではすべてのシナリオに対応できません。これは wiseflow の問題ではなく、従来のソフトウェアの問題です！\n\nしかし、この一年で Agent 技術が飛���的に進歩し、大規模言語モデルによって人間のインターネット行動を完全にシミュレートすることが技術的に可能であることが示されました。[openclaw](https://github.com/openclaw/openclaw) の登場は、この確信をさらに強めました。\n\nさらに驚くべきことに、初期の実験と探索を通じて、wiseflow のデータ取得能力を「プラグイン」として openclaw に統合することで、上記の2つの制限を完全に解決できることを発見しました。\n\nhttps://github.com/user-attachments/assets/8d097b3b-f9ab-42eb-98bb-88af5d28b089\n\nただし、openclaw のプラグインシステムは、従来の「プラグイン」（Claude Code のプラグインのようなもの）とは異なるため、「add-on」という概念を新たに導入する必要がありました。正確に言えば、wiseflow 5.x は openclaw の add-on として提供されます。オリジナルの openclaw には「add-on」アーキテクチャがありませんが、実際にはいくつかの簡単なシェルコマンドでこの「改造」を完了できます。また、実際のビジネスシーンに向���たプリセット設定を含む、すぐに使える openclaw 強化版 [openclaw_for_business](https://github.com/TeamWiseFlow/openclaw_for_business) も用意しています。クローンして、wiseflow のリリースを openclaw_for_business の add-on フォルダに配置するだけで使用できます。\n\n## ✨ wiseflow をインストールすることで何が得られますか（オリジナル openclaw より優れている点）？\n\n### 1. 
アンチ検出ブラウザ、ブラウザ拡張機能インストール不要\n\nwiseflow の patch-001 は、openclaw に内蔵された Playwright を [Patchright](https://github.com/Kaliiiiiiiiii-Vinyzu/patchright)（Playwright の検出回避フォーク）に置き換え、自動化ブラウザがターゲットサイトに検出・ブロックされる可能性を大幅に低減します。これにより、Chrome Relay Extension をインストールすることなく、マネージドブラウザだけで Relay と同等、あるいはそれ以上のウェブ取得・操作能力を実現できます。\n\n📥 *私たちは現在市場で人気のあるブラウザ自動化フレームワーク（nodriver、browser-use、Vercel の agent-browser など）を総合的に評価しました。すべてが CDP を通じて動作し、openclaw 専用の永続化プロファイルを提供するという基本原理は同じですが、CDP プローブの完全な除去を提供しているのは Patchright だけです。つまり、最も純粋な CDP 直接接続アプローチを使用しても、特徴的なフィンガープリントは残り、検出される可能性があります。他のフレームワークはデータ取得ではなく自動テストを目的として設計されていますが、Patchright はもともとデータ取得を目的として設計されており、本質的には Playwright のパッチであり、ほぼすべての Playwright の上位 API を継承しています。これにより openclaw とのネイティブな互換性が実現し、追加のプラグインや MCP をインストールする必要がありません。*\n\n### 2. タブ自動復元メカニズム\n\nAgent の操作中にターゲットタブが予期せず閉じられたり失われたりした場合、スナップショットベースのタブ復元を自動的に実行し、タブの消失によるタスクの中断を防ぎます。\n\n### 3. スマート検索 Skill\n\nopenclaw に内蔵された `web_search` をより強力な検索機能に置き換えます。オリジナルの内蔵 web search tool と比較して、スマート検索には3つの主要な優位性があります：\n\n- **完全無料、API キー不要**：サードパーティの検索 API に依存せず、ゼロコストで利用可能\n- **リアルタイム検索、最高の鮮度**：ブラウザを直接ターゲットページや主要なソーシャルメディアプラットフォーム（Weibo、Twitter/X、Facebook など）に誘導し、最新公開コンテンツを即座に取得\n- **検索ソースのカスタマイズ**：ユーザーが自由に検索ソースを指定でき、必要な情報を精確に取得\n\n### 4. 
新媒体小編 Crew（プリセット AI エージェント）\n\nすぐに使える中国語ソーシャルメディアコンテンツ制作 AI エージェントで、微博、小紅書、知乎、B ステーション、抖音などの中国の主要プラットフォームに特化しています。\n\n**主な機能：**\n\n- テーマリサーチ + トレンド分析（モード A）\n- 下書き拡充 + オンライン根拠追加（モード B）\n- 記事確定後、[文颜（Wenyan）](https://github.com/caol64/wenyan) を自動呼び出して WeChat 公式アカウント形式の HTML にレンダリング（7 種類の内蔵テーマ対応）\n- WeChat 公式アカウントの下書きに直接プッシュ（モード C、`WECHAT_APP_ID`/`WECHAT_APP_SECRET` の設定が必要）\n- AI 画像/動画生成サポート（[SiliconFlow](https://www.siliconflow.com/) 画像/動画生成、`SILICONFLOW_API_KEY` の設定が必要）\n\n## 🌟 クイックスタート\n\n> **💡 API コストのご説明**\n>\n> wiseflow 5.x は openclaw の Agent ワークフローをベースにしており、LLM API アクセスが必要です。事前に API 資格情報を準備することをお勧めします：\n>\n> - **海外ユーザー（推奨）**：[SiliconFlow](https://www.siliconflow.com/) — 登録後に無料クレジットが付与され、初期使用コストをカバーできます\n> - **OpenAI / Anthropic その他のプロバイダー**：互換性のある任意の API が使用可能です\n\n本リポジトリの [Releases](https://github.com/TeamWiseFlow/wiseflow/releases) から openclaw_for_business と wiseflow addon を含む統合パッケージをダウンロードしてください。\n\n1. アーカイブをダウンロードして解凍する\n2. 解凍されたフォルダに移動する\n3. 起動方式を選択する：\n\n   **デバッグモード**（単回起動、テスト・開発向け）：\n   ```bash\n   ./scripts/dev.sh gateway\n   ```\n\n   **本番モード**（システムサービスとしてインストール、長期運用向け）：\n   ```bash\n   ./scripts/reinstall-daemon.sh\n   ```\n\n> **システム要件**\n> - **Ubuntu 22.04** を推奨\n> - **Windows WSL2** 環境をサポート\n> - **macOS** をサポート\n> - **Windows ネイティブ**環境での直接実行は**非対応**\n\n### 【代替】手動インストール\n\n> 注意：先に openclaw_for_business をダウンロード・デプロイする必要があります。ダウンロード先：https://github.com/TeamWiseFlow/openclaw_for_business/releases\n\n本リポジトリ内の `wiseflow` フォルダ（リポジトリ全体ではありません）を openclaw_for_business の `addons/` ディレクトリにコピーしてください：\n\n```bash\n# 方法1：wiseflow リポジトリからクローン\ngit clone https://github.com/TeamWiseFlow/wiseflow.git /tmp/wiseflow\ncp -r /tmp/wiseflow/wiseflow <openclaw_for_business>/addons/wiseflow\n```\n\nインストール後、openclaw_for_business を再起動すると有効になります。\n\n## ディレクトリ構造\n\n```\nwiseflow/                         # addon パッケージ（addons/ ディレクトリに配置）\n├── addon.json                    # メタデータ\n├── overrides.sh                  # pnpm overrides + 内蔵 web_search を無効化\n├── patches/\n│   ├── 
001-browser-tab-recovery.patch        # タブ復元パッチ\n│   ├── 002-disable-web-search-env-var.patch  # 内蔵 web_search の無効化（env var）\n│   └── 003-act-field-validation.patch        # ACT フィールド検証パッチ\n├── skills/                       # グローバルスキル（全エージェント利用可能）\n│   ├── browser-guide/SKILL.md    # ブラウザのベストプラクティス（ログイン/CAPTCHA/遅延ロードなど）\n│   ├── smart-search/SKILL.md     # マルチプラットフォーム検索URL構築（内蔵 web_search の代替）\n│   └── rss-reader/               # RSS/Atom フィードリーダー\n│       ├── SKILL.md\n│       ├── package.json\n│       └── scripts/fetch-rss.mjs\n└── crew/                         # プリセット AI エージェント（Crew テンプレート）\n    └── new-media-editor/         # 新媒体小編（中国語ソーシャルメディアコンテンツ制作）\n        ├── IDENTITY.md / SOUL.md / AGENTS.md / TOOLS.md / ...\n        └── skills/               # Crew 専属スキル\n            ├── siliconflow-img-gen/   # AI 画像生成（SiliconFlow API）\n            ├── siliconflow-video-gen/ # AI 動画生成（SiliconFlow API）\n            └── wenyan-formatter/      # Markdown → WeChat HTML / 下書きプッシュ\n\ndocs/                             # 技術ドキュメント（リポジトリルート）\n├── anti-detection-research.md\n└── more_powerful_search_skill/\n\nscripts/                          # ユーティリティスクリプト（リポジトリルート）\n└── generate-patch.sh\n\ntests/                            # テストケースとスクリプト（リポジトリルート）\n├── README.md\n└── run-managed-tests.mjs\n```\n\n## WiseFlow Pro 版がリリースされました！\n\nより強力なスクレイピング能力、より包括的なソーシャルメディアサポート、UI インターフェースとワンクリックインストールパッケージ付き — デプロイ不要！\n\nhttps://github.com/user-attachments/assets/57f8569c-e20a-4564-a669-1200d56c5725\n\n🔥 **Pro 版が発売中**：https://shouxiqingbaoguan.com/\n\n🌹 本日より、wiseflow オープンソース版への PR 貢献（コード、ドキュメント、成功事例の共有すべて歓迎）が採用された場合、コントリビューターには wiseflow Pro 版の1年間ライセンスが贈呈されます！\n\n## 🛡️ ライセンス\n\nバージョン 4.2 以降、オープンソースライセンスを更新しました。詳細はこちら：[LICENSE](LICENSE)\n\n商用提携については **Email：zm.zhao@foxmail.com** までご連絡ください。\n\n## 📬 お問い合わせ\n\nご質問やご提案がございましたら、[issue](https://github.com/TeamWiseFlow/wiseflow/issues) からお気軽にメッセージをお寄せください。\n\n🎉 wiseflow && OFB では現在**有料ナレッジベース**を提供しています。内容には、ゼロからの手順インストールチュートリアル、各種独自の活用ノウハウ、および **VIP 
WeChat グループ**が含まれます：\n\nご相談は「掌柜的」企業 WeChat をご追加ください：\n\n<img width=\"360\" height=\"360\" alt=\"wiseflow掌柜\" src=\"https://github.com/user-attachments/assets/b013b3fd-546e-4176-b418-57bee419e761\" />\n\n🌹 オープンソース維持のご支援に感謝します！\n\n## 🤝 wiseflow 5.x は以下の優秀なオープンソースプロジェクトを基盤としています：\n\n- Patchright（Playwright テスト・自動化ライブラリの検出回避 Python 版）https://github.com/Kaliiiiiiiiii-Vinyzu/patchright-python\n- Feedparser（Python でフィードを解析）https://github.com/kurtmckee/feedparser\n- SearXNG（様々な検索サービスやデータベースから結果を集約する無料のインターネットメタ検索エンジン）https://github.com/searxng/searxng\n- 文颜（Wenyan）（多プラットフォーム Markdown フォーマットと投稿ツール、新媒体小編 Crew が wenyan-formatter スキル経由で使用）https://github.com/caol64/wenyan\n\n## Citation\n\n本プロジェクトの一部または全部を参照・引用する場合は、以下の情報を明記してください：\n\n```\nAuthor：Wiseflow Team\nhttps://github.com/TeamWiseFlow/wiseflow\n```\n\n## パートナー\n\n[<img src=\"https://github.com/TeamWiseFlow/wiseflow/raw/4.x/docs/logos/SiliconFlow.png\" alt=\"siliconflow\" width=\"360\">](https://siliconflow.com/)\n"
  },
  {
    "path": "README_KR.md",
    "content": "# Wiseflow\n\n**[中文](README.md) | [English](README_EN.md) | [日本語](README_JP.md) | [Deutsch](README_DE.md) | [Français](README_FR.md) | [العربية](README_AR.md)**\n\n🚀 **STEP INTO 5.x**\n\n> 📌 **4.x를 찾고 계신가요?** 원래 v4.30 이전 버전의 코드는 [`4.x` 브랜치](https://github.com/TeamWiseFlow/wiseflow/tree/4.x)에서 확인할 수 있습니다.\n\n```\n\"내 삶에는 한계가 있지만, 지식에는 한계가 없다. 유한한 것으로 무한한 것을 쫓으니, 위태로울 뿐이다!\" —— 『장자·내편·양생주제삼』\n```\n\nwiseflow 4.x(이전 버전 포함)는 일련의 정밀한 워크플로우를 통해 특정 시나리오에서 강력한 데이터 수집 능력을 구현했지만, 여전히 많은 한계가 존재했습니다:\n\n- 1. 인터랙티브 콘텐츠를 수집할 수 없음 (클릭해야만 나타나는 콘텐츠, 특히 동적 로딩의 경우)\n- 2. 정보 필터링과 추출만 가능하며, 다운스트림 작업 처리 능력이 거의 없음\n- ……\n\n기능 개선과 범위 확장에 꾸준히 노력해 왔지만, 현실 세계는 복잡하고 인터넷도 마찬가지입니다. 규칙을 완전히 망라하는 것은 불가능하므로, 고정된 워크플로우로는 모든 시나리오에 대응할 수 없습니다. 이것은 wiseflow의 문제가 아니라 전통적인 소프트웨어의 문제입니다!\n\n그러나 지난 1년간 Agent 기술의 비약적인 발전은 대규모 언어 모델로 인간의 인터넷 행동을 완전히 시뮬레이션하는 것이 기술적으로 가능하다는 것을 보여주었습니다. [openclaw](https://github.com/openclaw/openclaw)의 등장은 이러한 확신을 더욱 굳건히 했습니다.\n\n더욱 놀라운 것은, 초기 실험과 탐색을 통해 wiseflow의 데이터 수집 능력을 \"플러그인\" 형태로 openclaw에 통합하면 위에서 언급한 두 가지 한계를 완벽하게 해결할 수 있다는 것을 발견했습니다.\n\nhttps://github.com/user-attachments/assets/8d097b3b-f9ab-42eb-98bb-88af5d28b089\n\n다만, openclaw의 플러그인 시스템은 우리가 전통적으로 이해하는 \"플러그인\"(Claude Code의 플러그인과 유사한 것)과는 다르기 때문에, \"add-on\"이라는 개념을 별도로 도입해야 했습니다. 정확히 말하면, wiseflow 5.x는 openclaw add-on 형태로 제공됩니다. 원래 openclaw에는 \"add-on\" 아키텍처가 없지만, 실제로는 몇 가지 간단한 셸 명령어만으로 이 \"개조\"를 완료할 수 있습니다. 또한 실제 비즈니스 시나리오를 위한 프리셋 설정이 포함된 즉시 사용 가능한 openclaw 강화 버전인 [openclaw_for_business](https://github.com/TeamWiseFlow/openclaw_for_business)도 준비했습니다. 클론한 후 wiseflow 릴리스를 openclaw_for_business의 add-on 폴더에 배치하면 됩니다.\n\n## ✨ wiseflow를 설치하면 무엇을 얻을 수 있나요（원본 openclaw보다 우수한 점）？\n\n### 1. 탐지 방지 브라우저, 브라우저 확장 프로그램 설치 불필요\n\nwiseflow의 patch-001은 openclaw 내장 Playwright를 [Patchright](https://github.com/Kaliiiiiiiiii-Vinyzu/patchright)(Playwright의 탐지 방지 포크)로 교체하여, 자동화 브라우저가 대상 웹사이트에 감지·차단될 가능성을 크게 줄입니다. 
이를 통해 Chrome Relay Extension 설치 없이, 관리형 브라우저만으로도 Relay와 동등하거나 더 뛰어난 웹 수집 및 조작 능력을 달성할 수 있습니다.\n\n📥 *저희는 nodriver, browser-use, Vercel의 agent-browser 등 현재 시장에서 인기 있는 모든 브라우저 자동화 프레임워크를 종합적으로 평가했습니다. 모두 CDP를 통해 동작하고 openclaw 전용 지속적 프로필을 제공한다는 기본 원리는 같지만, CDP 프로브를 완전히 제거하는 것은 Patchright뿐입니다. 즉, 가장 순수한 CDP 직접 연결 방식을 사용하더라도 여전히 검출 가능한 특징이 남아 있습니다. 다른 프레임워크는 데이터 수집이 아닌 자동화 테스트를 목적으로 설계되었지만, Patchright는 처음부터 데이터 수집을 목적으로 설계되었습니다. 본질적으로 Playwright의 패치이기 때문에 거의 모든 Playwright 상위 API를 그대로 계승하며, 이로 인해 openclaw와 기본적으로 호환되어 추가 플러그인이나 MCP를 설치할 필요가 없습니다.*\n\n### 2. 자동 탭 복구 메커니즘\n\nAgent 작업 중 대상 탭이 예기치 않게 닫히거나 사라질 경우, 스냅샷 기반 탭 복구를 자동으로 수행하여 탭 소실로 인한 작업 중단을 방지합니다.\n\n### 3. 스마트 검색 Skill\n\nopenclaw 내장 `web_search`를 더욱 강력한 검색 기능으로 대체합니다. 원버전 내장 web search tool 대비 스마트 검색의 세 가지 핵심 강점:\n\n- **완전 무료, API 키 불필요**: 서드파티 검색 API에 의존하지 않아 비용 제로\n- **실시간 검색, 최고의 시의성**: 브라우저를 직접 대상 페이지나 주요 소셜 미디어 플랫폼(Weibo, Twitter/X, Facebook 등)으로 이동하여 최신 게시물을 즉시 검색\n- **검색 출처 사용자 정의 가능**: 사용자가 검색 출처를 자유롭게 지정하여 필요한 정보를 정확하게 취득\n\n### 4. 새 미디어 편집자 Crew（사전 설정 AI 에이전트）\n\n즉시 사용 가능한 중국어 소셜 미디어 콘텐츠 제작 AI 에이전트로, 웨이보, 샤오홍슈, 즈후, 빌리빌리, 더우인 등 중국 주요 플랫폼에 특화되어 있습니다.\n\n**주요 기능：**\n\n- 주제 리서치 + 트렌드 분석（Mode A）\n- 초안 확장 + 온라인 근거 추가（Mode B）\n- 기사 완성 후 [文颜（Wenyan）](https://github.com/caol64/wenyan)을 자동으로 호출하여 위챗 공식 계정 스타일 HTML로 렌더링（내장 테마 7개 지원）\n- 위챗 공식 계정 임시 보관함에 직접 발행（Mode C, `WECHAT_APP_ID`/`WECHAT_APP_SECRET` 설정 필요）\n- AI 이미지/영상 생성 지원（[SiliconFlow](https://www.siliconflow.com/) 이미지/영상 생성, `SILICONFLOW_API_KEY` 설정 필요）\n\n## 🌟 빠른 시작\n\n> **💡 API 비용 안내**\n>\n> wiseflow 5.x는 openclaw의 Agent 워크플로우를 기반으로 하며, LLM API 접근이 필요합니다. 사전에 API 자격 증명을 준비하시기 바랍니다:\n>\n> - **해외 사용자（권장）**：[SiliconFlow](https://www.siliconflow.com/) — 등록 후 무료 크레딧 지급, 초기 사용 비용 충당 가능\n> - **OpenAI / Anthropic 및 기타 제공업체**：호환 가능한 모든 API 사용 가능\n\n본 저장소의 [Releases](https://github.com/TeamWiseFlow/wiseflow/releases)에서 openclaw_for_business와 wiseflow addon이 포함된 통합 패키지를 다운로드하세요.\n\n1. 압축 파일을 다운로드하고 압축을 해제합니다\n2. 압축 해제된 폴더로 이동합니다\n3. 
시작 방식을 선택합니다:\n\n   **디버그 모드**（단회 실행, 테스트 및 개발용）:\n   ```bash\n   ./scripts/dev.sh gateway\n   ```\n\n   **프로덕션 모드**（시스템 서비스로 설치, 장기 운영용）:\n   ```bash\n   ./scripts/reinstall-daemon.sh\n   ```\n\n> **시스템 요구사항**\n> - **Ubuntu 22.04** 권장\n> - **Windows WSL2** 환경 지원\n> - **macOS** 지원\n> - **Windows 네이티브** 환경에서의 직접 실행은 **지원하지 않음**\n\n### 【대안】수동 설치\n\n> 주의: 먼저 openclaw_for_business를 다운로드하여 배포해야 합니다. 다운로드 주소: https://github.com/TeamWiseFlow/openclaw_for_business/releases\n\n저장소 내의 `wiseflow` 폴더(저장소 전체가 아님)를 openclaw_for_business의 `addons/` 디렉토리에 복사하세요:\n\n```bash\n# 방법 1: wiseflow 저장소에서 클론\ngit clone https://github.com/TeamWiseFlow/wiseflow.git /tmp/wiseflow\ncp -r /tmp/wiseflow/wiseflow <openclaw_for_business>/addons/wiseflow\n```\n\n설치 후 openclaw_for_business를 재시작하면 적용됩니다.\n\n## 디렉토리 구조\n\n```\nwiseflow/                         # addon 패키지（addons/ 디렉토리에 복사）\n├── addon.json                    # 메타데이터\n├── overrides.sh                  # pnpm overrides + 내장 web_search 비활성화\n├── patches/\n│   ├── 001-browser-tab-recovery.patch        # 탭 복구 패치\n│   ├── 002-disable-web-search-env-var.patch  # 내장 web_search 비활성화 (env var)\n│   └── 003-act-field-validation.patch        # ACT 필드 유효성 검사 패치\n├── skills/                       # 글로벌 스킬（모든 에이전트 사용 가능）\n│   ├── browser-guide/SKILL.md    # 브라우저 모범 사례 (로그인/캡차/지연 로딩 등)\n│   ├── smart-search/SKILL.md     # 다중 플랫폼 검색 URL 빌더 (내장 web_search 대체)\n│   └── rss-reader/               # RSS/Atom 피드 리더\n│       ├── SKILL.md\n│       ├── package.json\n│       └── scripts/fetch-rss.mjs\n└── crew/                         # 사전 설정 AI 에이전트（Crew 템플릿）\n    └── new-media-editor/         # 새 미디어 편집자（중국어 소셜 미디어 콘텐츠 제작）\n        ├── IDENTITY.md / SOUL.md / AGENTS.md / TOOLS.md / ...\n        └── skills/               # Crew 전용 스킬\n            ├── siliconflow-img-gen/   # AI 이미지 생성（SiliconFlow API）\n            ├── siliconflow-video-gen/ # AI 영상 생성（SiliconFlow API）\n            └── wenyan-formatter/      # Markdown → 위챗 HTML / 임시 보관함 발행\n\ndocs/             
                # 기술 문서（저장소 루트）\n├── anti-detection-research.md\n└── more_powerful_search_skill/\n\nscripts/                          # 유틸리티 스크립트（저장소 루트）\n└── generate-patch.sh\n\ntests/                            # 테스트 케이스 및 스크립트（저장소 루트）\n├── README.md\n└── run-managed-tests.mjs\n```\n\n## WiseFlow Pro 버전 출시!\n\n더 강력한 스크래핑 능력, 더 포괄적인 소셜 미디어 지원, UI 인터페이스 및 원클릭 설치 패키지 — 배포 불필요!\n\nhttps://github.com/user-attachments/assets/57f8569c-e20a-4564-a669-1200d56c5725\n\n🔥 **Pro 버전 판매 중**: https://shouxiqingbaoguan.com/\n\n🌹 오늘부터 wiseflow 오픈소스 버전에 PR 기여(코드, 문서, 성공 사례 공유 모두 환영)가 채택되면, 기여자에게 wiseflow Pro 버전 1년 사용권이 증정됩니다!\n\n## 🛡️ 라이선스\n\n버전 4.2부터 오픈소스 라이선스를 업데이트했습니다. 자세한 내용은: [LICENSE](LICENSE)\n\n상업적 협력 문의: **Email: zm.zhao@foxmail.com**\n\n## 📬 연락처\n\n질문이나 제안이 있으시면 [issue](https://github.com/TeamWiseFlow/wiseflow/issues)를 통해 메시지를 남겨주세요.\n\n🎉 wiseflow && OFB에서 현재 **유료 지식 베이스**를 제공하고 있습니다. 내용에는 단계별 설치 튜토리얼, 각종 독점 활용 팁, **VIP 위챗 그룹**이 포함됩니다：\n\n\"掌柜的\" 기업 위챗을 추가하여 문의하세요：\n\n<img width=\"360\" height=\"360\" alt=\"wiseflow掌柜\" src=\"https://github.com/user-attachments/assets/b013b3fd-546e-4176-b418-57bee419e761\" />\n\n🌹 오픈소스 유지에 응원해 주셔서 감사합니다!\n\n## 🤝 wiseflow 5.x는 다음의 우수한 오픈소스 프로젝트를 기반으로 합니다:\n\n- Patchright (Playwright 테스트 및 자동화 라이브러리의 탐지 우회 Python 버전) https://github.com/Kaliiiiiiiiii-Vinyzu/patchright-python\n- Feedparser (Python으로 피드 파싱) https://github.com/kurtmckee/feedparser\n- SearXNG (다양한 검색 서비스와 데이터베이스에서 결과를 집계하는 무료 인터넷 메타 검색 엔진) https://github.com/searxng/searxng\n- Wenyan (다중 플랫폼 Markdown 서식 및 게시 도구, 새 미디어 편집자 Crew가 wenyan-formatter 스킬을 통해 사용) https://github.com/caol64/wenyan\n\n## Citation\n\n본 프로젝트의 일부 또는 전체를 참조하거나 인용하는 경우, 다음 정보를 명시해 주세요:\n\n```\nAuthor: Wiseflow Team\nhttps://github.com/TeamWiseFlow/wiseflow\n```\n\n## 파트너\n\n[<img src=\"https://github.com/TeamWiseFlow/wiseflow/raw/4.x/docs/logos/SiliconFlow.png\" alt=\"siliconflow\" width=\"360\">](https://siliconflow.com/)\n"
  },
  {
    "path": "docs/anti-detection-research.md",
    "content": "# 浏览器自动化反检测方案调研报告\n\n> 调研日期：2026-02-20\n> 目标：评估 rebrowser-patches 与 patchright 两个方案，为 OpenClaw 集成反检测能力提供技术路线\n\n---\n\n## 一、问题背景\n\n### 1.1 为什么本地 AI 助手也需要反检测\n\nOpenClaw 作为本地 AI 助手，通过浏览器执行用户指令（比价、填表、读取信息、搜索等）。目标网站不区分\"本地助手\"和\"恶意爬虫\"——检测到自动化特征就会触发防御：\n\n- 电商比价 → 触发验证码或封 IP\n- 表单提交 → 被拒绝\n- 银行/邮箱 → 风控要求重新验证\n- Google 搜索 → CAPTCHA\n- 后台管理 → WAF 拦截\n\n### 1.2 Playwright 的主要检测泄露点\n\n| 泄露点 | 检测原理 | 严重程度 |\n|--------|---------|---------|\n| `Runtime.enable` CDP 调用 | 激活 Runtime domain 后浏览器内部行为变化，反爬脚本可通过侧信道检测 | **致命** |\n| `Console.enable` CDP 调用 | 类似 Runtime.enable 的侧信道 | 高 |\n| `--enable-automation` 启动参数 | 设置 `navigator.webdriver = true` | 高 |\n| `//# sourceURL=pptr:evaluate` | 注入脚本带有特征性 sourceURL 注释 | 中 |\n| `__playwright_utility_world__` | utility world 名称可被检测 | 中 |\n| init script 通过 CDP 注入 | `Page.addScriptToEvaluateOnNewDocument` 有检测方法 | 中 |\n| Playwright 自带 Chromium | 定制版浏览器与正版 Chrome 有指纹差异 | 中（connectOverCDP 时不存在） |\n| 大量自动化启动参数 | 参数组合本身是指纹 | 低-中 |\n\n### 1.3 OpenClaw 当前状态\n\n- 使用 `playwright-core@1.58.2`\n- 已经通过 `chromium.connectOverCDP()` 连接浏览器（不用 launch）\n- Extension relay 模式下连接用户真实 Chrome，零启动参数\n- 但 **Playwright driver 层的 CDP 泄露未处理**（Runtime.enable、Console.enable 等）\n\n---\n\n## 二、方案 A：rebrowser-patches\n\n### 2.1 项目概况\n\n- GitHub: `rebrowser/rebrowser-patches` (1.2k stars)\n- 生态：rebrowser-playwright-core（drop-in 替代包）、rebrowser-puppeteer 等\n- 方式：对现有 playwright-core 源码打补丁（Unix `patch` 命令）\n- 支持：Puppeteer + Playwright\n\n### 2.2 修补内容\n\n#### Runtime.enable 修复 — 3 种模式\n\n**模式 1：`addBinding`（默认，推荐）**\n\n```\n原理：\n1. 生成随机名称 binding（如 \"x7k2m9q\"）\n2. 通过 Runtime.addBinding 注册（不需要 Runtime.enable）\n3. 在 isolated world 中 dispatch 自定义事件触发 binding\n4. 从 Runtime.bindingCalled 回调拿到真实 executionContextId\n5. 
用该 contextId 做后续 Runtime.evaluate\n\n优势：\n- 完全不调用 Runtime.enable\n- 保留对 main world 的完整访问（能读页面变量）\n- 支持 web workers 和 iframes\n```\n\n**模式 2：`alwaysIsolated`**\n\n```\n原理：所有脚本执行都在 Page.createIsolatedWorld 创建的隔离上下文中\n\n优势：完全隔离，防止 MutationObserver 检测\n劣势：无法访问 main world 变量，不支持 web workers\n```\n\n**模式 3：`enableDisable`**\n\n```\n原理：快速 enable → 捕获 context ID → 立即 disable\n\n优势：完整 main world 访问\n劣势：有短暂时间窗口可能被检测到\n```\n\n#### 其他修复\n\n| 补丁 | 效果 | 配置 |\n|------|------|------|\n| sourceURL 伪装 | `pptr:evaluate` → `app.js`（可自定义） | `REBROWSER_PATCHES_SOURCE_URL=jquery.min.js` |\n| utility world 名称 | `__puppeteer_utility_world__` → `util`（可自定义） | `REBROWSER_PATCHES_UTILITY_WORLD_NAME=util` |\n\n### 2.3 未修复的内容\n\n- Console.enable — 未处理（但也未禁用，保留了完整功能）\n- init script 注入方式 — 仍走 CDP\n- CSP — 未处理\n- Closed Shadow Root — 未处理\n- 启动参数 — 未修改（需自行配置）\n- 指纹伪装 — 不在范围内\n\n### 2.4 配置方式\n\n```bash\n# 环境变量（运行时可切换）\nREBROWSER_PATCHES_RUNTIME_FIX_MODE=addBinding    # 默认\nREBROWSER_PATCHES_RUNTIME_FIX_MODE=alwaysIsolated\nREBROWSER_PATCHES_RUNTIME_FIX_MODE=enableDisable\nREBROWSER_PATCHES_RUNTIME_FIX_MODE=0              # 禁用\n\nREBROWSER_PATCHES_SOURCE_URL=app.js\nREBROWSER_PATCHES_UTILITY_WORLD_NAME=util\nREBROWSER_PATCHES_DEBUG=1\n```\n\n### 2.5 集成方式\n\n```bash\n# 方式 1：打补丁（npm install 后需重新执行）\nnpx rebrowser-patches@latest patch --packageName playwright-core\n\n# 方式 2：替换包（推荐，一劳永逸）\n# package.json:\n#   \"playwright-core\": \"1.58.2\" → \"rebrowser-playwright-core\": \"1.58.2\"\n# 无需改 import 路径\n```\n\n方案 A 测试下来与 openclaw 存在一定兼容性问题。\n\nopenclaw 使用 playwright 1.58.2 版本，但是 rebrowser-patches 最新只在 playwright 1.52.0 版本上进行过全面测试。\n\nopenclaw 高度依赖 playwright 的私有 api，比如 _snapshotForAI() 等。应用 rebrowser-patches 后，此接口无法工作。\n\n---\n\n## 三、方案 B：Patchright\n\n### 3.1 项目概况\n\n- GitHub: `Kaliiiiiiiiii-Vinyzu/patchright` + `patchright-python` (1.1k stars)\n- 方式：fork Playwright 源码，通过 ts-morph AST 重写 22 个核心模块，编译为独立包\n- 支持：仅 Playwright\n- 自动化：每小时检查 Playwright 新版本，自动 patch 并发布\n\n### 3.2 修补内容\n\n| 补丁 | 详情 
|\n|------|------|\n| **Runtime.enable removed** | Calls deleted outright from crPage, crDevTools, and crServiceWorker |\n| **Console.enable disabled** | Console domain removed entirely |\n| **Launch flag cleanup** | Removes 6 flags including `--enable-automation`, adds `--disable-blink-features=AutomationControlled` |\n| **init script injection** | Switched to HTTP route interception → injects a `<script>` tag into the HTML `<head>` |\n| **CSP bypass** | Automatically rewrites Content-Security-Policy, adding nonce/unsafe-inline |\n| **sourceURL removed** | Deletes all `//# sourceURL` comments |\n| **Service Worker** | Registration silently blocked (removes the console.warn that leaked information) |\n| **Closed Shadow Root** | Can pierce Shadow DOM with mode:'closed' |\n| **evaluate() reworked** | New `isolated_context` parameter (default true) |\n\n### 3.3 Cost\n\n| Capability lost | Impact |\n|-----------|------|\n| **Console API fully disabled** | `page.on(\"console\")` never fires |\n| **page.pause() debugging** | Unknown whether it is affected |\n| **Console log collection** | Needs an alternative (JS injection) |\n\n### 3.4 Integration\n\n```bash\n# Import paths must change\n# package.json:\n#   \"playwright-core\": \"1.58.2\" → \"patchright-core\": \"1.57.0\"\n# All source files:\n#   import { chromium } from \"playwright-core\" → import { chromium } from \"patchright-core\"\n```\n\nAlready landed in this repository (2026-02-22):\n\n- `openclaw/package.json` now uses `patchright-core@1.57.0` (the latest version available on npm at the time)\n- All `playwright-core` imports under `openclaw/src/browser/*` now point to `patchright-core`\n- `scripts/dev.sh` / `scripts/apply-patches.sh` no longer run the automatic rebrowser patch step\n- The upstream change is captured as a business patch: `patches/001-switch-playwright-to-patchright-core.patch`\n\n### 3.5 Known issues (from GitHub Issues)\n\n- `#94` Newer versions end up being detected (dist-info exposure)\n- `#100` Cloudflare 403 errors\n- `#101` Google Anti-Bot triggered\n- `#170` Sannysoft detects patchright\n\n---\n\n## 4. Comparison\n\n### 4.1 Core differences\n\n| Dimension | rebrowser-patches | Patchright |\n|------|-------------------|-----------|\n| **Patching approach** | Runtime patch / drop-in package | Compile-time fork and rewrite |\n| **Invasiveness** | Low: one-step rollback | High: every import must change |\n| **Runtime.enable** | 3 selectable modes, addBinding by default | Single approach: isolated context |\n| **Console API** | **Kept** | **Disabled** |\n| **main world access** | ✅ Fully kept in addBinding mode | ⚠️ Limited by isolated context |\n| **init script injection** | Unchanged (still via CDP) | ✅ Switched to HTML injection |\n| **CSP 
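bypass** | Not handled | ✅ Handled automatically |\n| **Closed Shadow Root** | Not handled | ✅ Can pierce |\n| **Launch flag cleanup** | Not handled (configure yourself) | ✅ Cleaned automatically |\n| **Configuration flexibility** | Environment variables, switchable at runtime | Fixed at compile time |\n| **`_snapshotForAI` compatibility** | Likely compatible (small change surface) | Higher risk (broad rewrite) |\n| **OpenClaw console collection** | ✅ Unaffected | ❌ Needs rework |\n| **GitHub dependents** | Has a drop-in replacement package ecosystem | 0 known dependent projects |\n| **Anti-detection pass rate** | No full test results published | Claims to pass Cloudflare/Kasada/Datadome etc. (but issues report failures) |\n\n### 4.2 Which to use when\n\n| Scenario | Recommended | Reason |\n|------|---------|------|\n| **OpenClaw integration (first choice)** | rebrowser-patches | Keeps the Console API, small change, low risk, reversible |\n| **Most aggressive anti-detection needed** | Patchright | init script HTML injection + CSP bypass + closed shadow root |\n| **Quick feasibility check** | rebrowser-patches | One command to patch, no code changes |\n| **Long-term maintenance** | Either | rebrowser has a drop-in package; patchright tracks upstream automatically |\n\n---\n\n## 5. OpenClaw integration plan\n\n### 5.1 Phase 1: minimal-change validation (rebrowser-patches)\n\n**Goal**: zero code changes, verify basic compatibility\n\n```bash\ncd openclaw\n# Apply the patch\nnpx rebrowser-patches@latest patch --packageName playwright-core\n\n# Set environment variables\nexport REBROWSER_PATCHES_RUNTIME_FIX_MODE=addBinding\nexport REBROWSER_PATCHES_SOURCE_URL=app.js\nexport REBROWSER_PATCHES_UTILITY_WORLD_NAME=util\n\n# Start OpenClaw and test\n```\n\n**Checklist**:\n- [ ] OpenClaw starts normally\n- [ ] `_snapshotForAI()` works\n- [ ] `page.on(\"console\")` events fire\n- [ ] `page.evaluate()` runs\n- [ ] Page navigation and element interaction work\n- [ ] Extension relay mode works\n- [ ] Non-Extension mode works\n\n### 5.2 Phase 2: anti-detection effectiveness test\n\n**Goal**: quantify the anti-detection improvement\n\nTest against each of the following detection sites:\n\n| Site | URL | What it tests |\n|--------|-----|-------|\n| CreepJS | `https://abrahamjuliot.github.io/creepjs/` | Overall fingerprint |\n| Sannysoft | `https://bot.sannysoft.com/` | navigator.webdriver etc. |\n| Incolumitas | `https://bot.incolumitas.com/` | Advanced detection |\n| Browserscan | `https://browserscan.net/` | Browser fingerprint |\n| Pixelscan | `https://pixelscan.net/` | Fingerprint consistency |\n\n**Test matrix** (6 combinations):\n\n```\n                          Unpatched    rebrowser-patches\nNon-Extension mode:          A1              A2\nExtension + real Chrome:     B1              B2\n```\n\n**Metrics to compare**:\n- Score / passed items on each detection site\n- Value of `navigator.webdriver`\n- Whether Runtime.enable leaks\n- CreepJS Trust 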
Score\n\n### 5.3 Phase 3: Patchright comparison test\n\n**Goal**: assess whether Patchright's extra gains are worth the cost\n\n```bash\ncd openclaw\n# Swap the package\n# package.json: \"playwright-core\" → \"patchright-core\"\n# Bulk-replace imports (8 files)\n# Rework the console collection logic\n\n# Run the same Phase 2 test matrix\n```\n\n**Additional checks**:\n- [ ] Whether `_snapshotForAI()` is compatible (most critical)\n- [ ] Whether the console collection alternative is reliable\n- [ ] Whether init script HTML injection yields a higher pass rate\n\n### 5.4 Phase 4: production rollout (choose based on Phase 2/3 results)\n\n**If rebrowser-patches is chosen**:\n```json\n// package.json\n{\n  \"dependencies\": {\n    \"rebrowser-playwright-core\": \"1.58.2\"  // replaces playwright-core\n  }\n}\n```\n\n**Extra hardening** (whichever option is chosen):\n- [ ] OpenClaw non-Extension mode: add `--disable-blink-features=AutomationControlled` to the launch flags\n- [ ] OpenClaw non-Extension mode: remove `--enable-automation` and other automation flags\n- [ ] Review the direct `Runtime.enable` calls in `cdp.ts` and assess whether they can be removed\n- [ ] Extension mode: consider adding automatic reconnection on `chrome.runtime.onStartup`\n\n---\n\n## 6. Cleaning up OpenClaw's raw CDP layer\n\nOpenClaw's own `cdp.ts` sends CDP commands directly over WebSocket, bypassing the Playwright driver:\n\n```typescript\n// These calls do not go through Playwright and are unaffected by rebrowser-patches / patchright\nsend(\"Runtime.enable\")\nsend(\"Runtime.evaluate\", { expression, awaitPromise })\nsend(\"Runtime.terminateExecution\")\n```\n\n**To be assessed**:\n1. Can `Runtime.enable` be removed? When only `Runtime.evaluate` is used, some Chrome versions do not require enabling first\n2. Does `Runtime.terminateExecution` require enabling first? Needs testing\n3. 
If enabling is unavoidable, follow rebrowser-patches' enableDisable mode (quickly enable → grab the contextId → immediately disable)\n\n---\n\n## 7. Detection surface overview\n\nDetection surface of the two modes once the changes are in place:\n\n### Non-Extension mode (rebrowser-patches + flag cleanup)\n\n```\n✅ Eliminated:\n  - Runtime.enable leak (rebrowser-patches addBinding mode)\n  - sourceURL signature (rebrowser-patches)\n  - utility world name (rebrowser-patches)\n  - navigator.webdriver (--disable-blink-features=AutomationControlled)\n  - --enable-automation flag\n\n⚠️ Still present:\n  - --remote-debugging-port flag (required)\n  - Console.enable (not handled by rebrowser-patches)\n  - Fingerprint of Playwright's bundled Chromium (if connectOverCDP is not used)\n  - Runtime.enable in OpenClaw's cdp.ts (needs separate cleanup)\n  - init script injected via CDP (unchanged by rebrowser-patches)\n```\n\n### Extension + real Chrome mode (rebrowser-patches + flag cleanup)\n\n```\n✅ Eliminated:\n  - Runtime.enable leak\n  - sourceURL signature\n  - utility world name\n  - navigator.webdriver\n  - --remote-debugging-port (not needed with the Extension)\n  - --enable-automation (not needed with the Extension)\n  - Browser fingerprint differences (real Chrome)\n  - Empty profile (real user profile)\n  - All automation launch flags (zero flags)\n\n⚠️ Still present:\n  - Console.enable\n  - Runtime.enable in OpenClaw's cdp.ts (needs separate cleanup)\n  - init script injected via CDP\n  - The Chrome extension may be probed (visible in chrome://extensions)\n  - chrome.debugger debugging banner (not detectable by page JS, but visible to the user)\n```\n\nSwapping rebrowser-patches for Patchright would also eliminate Console.enable and init script injection, at the cost of losing the Console API and a higher compatibility risk.\n\n---\n\n## 8. Reference file index\n\n### Project source\n\n| Project | Key file | Purpose |\n|------|---------|------|\n| OpenClaw | `src/browser/pw-session.ts:335` | `chromium.connectOverCDP()` connection entry point |\n| OpenClaw | `src/browser/pw-session.ts:217-283` | Page event listeners (including console) |\n| OpenClaw | `src/browser/pw-tools-core.snapshot.ts:54-62` | `_snapshotForAI()` call |\n| OpenClaw | `src/browser/extension-relay.ts` | Extension relay server |\n| OpenClaw | `assets/chrome-extension/background.js` | Core extension logic |\n| OpenClaw | `src/browser/cdp.ts` | Raw CDP layer (contains Runtime.enable) |\n| rebrowser-patches | `patches/playwright-core/src.patch` | Playwright core patch |\n| rebrowser-patches | `scripts/patcher.js` | Patch application script |\n| Patchright | `patchright_driver_patch.js` 
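| Main orchestration script |\n| Patchright | `driver_patches/crPagePatch.js` | Largest patch (~470 lines) |\n| Patchright | `driver_patches/crNetworkManagerPatch.js` | HTTP injection patch (~465 lines) |\n\n### Detection mechanism reference\n\n| Detection vector | Description |\n|---------|------|\n| `Runtime.enable` leak | Enabling the CDP domain changes internal browser behavior, which page JS can detect through a side channel |\n| `navigator.webdriver` | `--enable-automation` sets this property to true |\n| `sourceURL` fingerprint | The `//# sourceURL=pptr:evaluate` comment on injected scripts exposes the automation framework |\n| utility world detection | An execution context named `__playwright_utility_world__` can be enumerated |\n| Chrome binary fingerprint | The UA/WebGL/internal APIs of Playwright's bundled Chromium differ from stock Chrome |\n| launch args fingerprint | A combination of many `--disable-*` flags is an automation signature |\n| empty profile | No history, no cookies, and empty localStorage are a strong automation signal |\n\n---\n\n## 9. Experiment plan for tomorrow\n\n### Priorities\n\n1. **rebrowser-patches basic validation** (Phase 1): 30 minutes\n   - Patch → start OpenClaw → basic functional tests\n\n2. **Quantify anti-detection effectiveness** (Phase 2): 1 hour\n   - 6 combinations × 5 detection sites = 30 tests\n\n3. **Patchright validation** (Phase 3, if Phase 2 is not good enough): 1-2 hours\n   - Swap the package → compatibility tests → anti-detection tests\n\n4. **cdp.ts cleanup assessment** (Phase 4): depends on the results of the earlier steps\n\n### Expected conclusions\n\n- If rebrowser-patches in addBinding mode passes the mainstream detection sites and OpenClaw works normally → choose rebrowser-patches\n- If rebrowser-patches is not enough and Patchright passes additional key checks → assess whether Patchright's compatibility cost is acceptable\n- If neither is enough → consider a combined approach (rebrowser-patches + extra JS injection)\n"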
  },
  {
    "path": "docs/more_powerful_search_skill/20260308_done.md",
    "content": "# 目的一\n\n为 wiseflow add-on 增加一个 skill，旨在让 Agent 通过使用 skill 可以更好的操作浏览器完成各种搜索任务。替换 openclaw 内置的 web_search 工具。\n\n## 实现方案\n\n解析用户指令，按已知规则直接构造查询 url。对于 filter 和 sort 要求，按各平台摸索出的方法指导 agent。\n\n其中常见社交媒体平台的搜索 url 构造和具体指导见  ./direct_url_for_search_on_media_platform.md\n\n额外的去 /extra 文件夹下挨个分析里面的 python 脚本，提炼出查询 url，添加到本 skill 的支持列表中。\n\n# 目的二\n\n为 wiseflow add-on 增加一个 skill，旨在让 Agent 通过使用 skill 获取和解析 rss 信源\n\n## 实现方案\n\n参考 rss_parsor.py\n\n"
  },
  {
    "path": "docs/more_powerful_search_skill/direct_url_for_search_on_media_platform.md",
    "content": "# 自媒体平台的搜索\n\n## bilibili（哔哩哔哩，简称：b站）：\n\nhttps://search.bilibili.com/{channel}?keyword={keyword}\n\nkeyword 多个的话，之间用 + 连接\n\nchannel 可选：\n\n- 综合：all\n- 视频：video\n- 番剧：bangumi\n- 影视：pgc\n- 直播：live\n- 专栏：article\n- 用户：upuser\n\n每个 channel 都可以指定搜索规则（默认不指定任何），具体规则如下：\n\n### all\n\n支持的搜索规则包括：\n- 最多播放：&order=click\n- 最新发布：&order=pubdate\n- 最多弹幕：&order=dm\n- 最多收藏：&order=stow\n\n示例：\n\n- 默认搜索：https://search.bilibili.com/all?keyword=%E8%8D%AF%E5%B1%8B%E5%B0%91%E5%A5%B3%E7%9A%84%E5%91%A2%E5%96%83\n- 最多播放：https://search.bilibili.com/all?keyword=%E8%8D%AF%E5%B1%8B%E5%B0%91%E5%A5%B3%E7%9A%84%E5%91%A2%E5%96%83&order=click\n\n### video\n\n支持的搜索规则包括：\n- 最多播放：&order=click\n- 最新发布：&order=pubdate\n- 最多弹幕：&order=dm\n- 最多收藏：&order=stow\n\n示例：\n\n- 默认搜索：https://search.bilibili.com/video?keyword=%E8%8D%AF%E5%B1%8B%E5%B0%91%E5%A5%B3%E7%9A%84%E5%91%A2%E5%96%83\n- 最新发布：https://search.bilibili.com/video?keyword=%E8%8D%AF%E5%B1%8B%E5%B0%91%E5%A5%B3%E7%9A%84%E5%91%A2%E5%96%83&order=pubdate\n\n### bangumi\n\n这个 channel 不支持搜索规则，只有默认搜索\n\n### pgc\n\n这个 channel 不支持搜索规则，只有默认搜索\n\n### live\n\n支持的搜索规则包括（默认是搜全部）：\n\n- 搜主播：&search_type=live_user\n- 搜直播间：&search_type=live_room\n- 按最新开播顺序搜直播间:search_type=live_room&order=live_time\n\n示例：\n\n- 默认搜索(搜全部）：https://search.bilibili.com/live?keyword=%E8%8D%AF%E5%B1%8B%E5%B0%91%E5%A5%B3%E7%9A%84%E5%91%A2%E5%96%83\n- 按最新开播顺序搜直播间：https://search.bilibili.com/live?keyword=%E8%8D%AF%E5%B1%8B%E5%B0%91%E5%A5%B3%E7%9A%84%E5%91%A2%E5%96%83&search_type=live_room&order=live_time\n\n### article\n\n支持的搜索规则包括：\n- 最新发布：&order=pubdate\n- 最多点击：&order=click\n- 最受欢迎：&order=attention\n- 最多评论：&order=scores\n\n示例：\n\n- 默认搜索：https://search.bilibili.com/article?keyword=%E8%8D%AF%E5%B1%8B%E5%B0%91%E5%A5%B3%E7%9A%84%E5%91%A2%E5%96%83\n- 最多评论：https://search.bilibili.com/article?keyword=%E8%8D%AF%E5%B1%8B%E5%B0%91%E5%A5%B3%E7%9A%84%E5%91%A2%E5%96%83&order=scores\n\n### upuser\n\n支持的搜索规则包括：\n- 粉丝数由高到低：&order=fans\n- 粉丝数由低到高：&order=fans&order_sort=1\n- 
Member level, high to low: &order=level\n- Member level, low to high: &order=level&order_sort=1\n\nExamples:\n\n- Default search: https://search.bilibili.com/upuser?keyword=%E8%8D%AF%E5%B1%8B%E5%B0%91%E5%A5%B3%E7%9A%84%E5%91%A2%E5%96%83\n- Followers, high to low: https://search.bilibili.com/upuser?keyword=%E8%8D%AF%E5%B1%8B%E5%B0%91%E5%A5%B3%E7%9A%84%E5%91%A2%E5%96%83&order=fans\n\n## Douyin (抖音, abbreviated dy):\n\n- General search: https://www.douyin.com/search/{keyword}?type=general\n- Video search: https://www.douyin.com/search/{keyword}?type=video\n- User search: https://www.douyin.com/search/{keyword}?type=user\n- Live search: https://www.douyin.com/search/{keyword}?type=live\n\nJoin multiple keywords with %20, e.g. https://www.douyin.com/search/wiseflow%20%E8%B4%9F%E9%9D%A2\n\ntype defaults to general search\n\nSorting or filtering the results must be done by interacting with the page\n\n## Weibo (微博, abbreviated wb):\n\n- General search: https://s.weibo.com/weibo/{keyword}\n- Real-time search (newest): https://s.weibo.com/realtime?q={keyword}\n- User search: https://s.weibo.com/user?q={keyword}\n- Topic search: https://s.weibo.com/topic?q={keyword}\n\n## Xiaohongshu (小红书, abbreviated xhs, also called 红薯):\n\nhttps://www.xiaohongshu.com/search_result?keyword={keyword}&source=web_explore_feed\n\nOn Xiaohongshu, search channels, filters, and sort order must all be set by interacting with the page\n\n## Zhihu (知乎):\n\n- General: https://www.zhihu.com/search?type=content&q={keyword}\n- Users (find people): https://www.zhihu.com/search?q={keyword}&type=people\n- Papers: https://www.zhihu.com/search?q={keyword}&type=scholar\n- Columns: https://www.zhihu.com/search?q={keyword}&type=column\n- E-books: https://www.zhihu.com/search?q={keyword}&type=publication\n- Circles: https://www.zhihu.com/search?q={keyword}&type=ring\n- Topics: https://www.zhihu.com/search?q={keyword}&type=topic\n- Videos: https://www.zhihu.com/search?q={keyword}&type=zvideo\n\nJoin multiple keywords with %20, e.g. https://www.zhihu.com/search?type=zvideo&q=wiseflow%20%E4%BB%98%E8%B4%B9\n\nThe following support filters or sorting through the url:\n\n### General (综合)\n\nBase: https://www.zhihu.com/search?type=content&q={keyword}\n\n- Filters:\n  - Answers only, append to the url: &type=content&vertical=answer\n  - Articles only, append to the url: &type=content&vertical=article\n  - Videos only, append to the url: &type=content&vertical=zvideo\n\n- Sorting:\n  - Most upvoted, append to the url: &sort=upvoted_count\n  - Newest, 
append to the url: &sort=created_time\n\n- Time limit:\n  - Within a day, append to the url: &time_interval=a_day\n  - Within a week, append to the url: &time_interval=a_week\n  - Within a month, append to the url: &time_interval=a_month\n  - Within three months, append to the url: &time_interval=three_months\n  - Within half a year, append to the url: &time_interval=half_a_year\n  - Within a year, append to the url: &time_interval=a_year\n\nAll of the above can be combined freely, e.g. https://www.zhihu.com/search?q=wiseflow%20%E4%BB%98%E8%B4%B9&sort=created_time&time_interval=a_month&type=content&vertical=article\n\n## twitter (X, 推特)\n\n- TOP: https://x.com/search?q={keyword}\n- Latest: https://x.com/search?q={keyword}&f=live\n- People (find people): https://x.com/search?q={keyword}&f=user\n- Media: https://x.com/search?q={keyword}&f=media\n- Lists: https://x.com/search?q={keyword}&f=list\n\nJoin multiple keywords with %20, e.g. https://x.com/search?q=wiseflow%20%E8%BD%AF%E4%BB%B6&src=typed_query&f=list\n\nAny of these can be combined with the Near You option by appending &lf=on, e.g. https://x.com/search?q=wiseflow&f=live&lf=on\n\n## facebook (FB, 脸书)\n\n- ALL: https://www.facebook.com/search/top/?q={keyword}\n- People (find people): https://www.facebook.com/search/people/?q={keyword}\n- pages: https://www.facebook.com/search/pages?q={keyword}\n- groups: https://www.facebook.com/search/groups?q={keyword}\n- events: https://www.facebook.com/search/events?q={keyword}\n\nJoin multiple keywords with %20, e.g. https://www.facebook.com/search/top/?q=jinchen%20%E4%BD%8F%E5%8F%8B\n\nSearch filters and the like must be set by interacting with the page\n\n## github\n\n- Repositories: https://github.com/search?q={keyword}&type=repositories\n- Users: https://github.com/search?q={keyword}&type=users\n- Issues: https://github.com/search?q={keyword}&type=issues\n- Pull Requests: https://github.com/search?q={keyword}&type=pullrequests\n- Code: https://github.com/search?q={keyword}&type=code\n- discussions: https://github.com/search?q={keyword}&type=discussions\n- Wikis: https://github.com/search?q={keyword}&type=wikis\n- topics: https://github.com/search?q={keyword}&type=topics\n\nJoin multiple keywords with +, e.g. https://github.com/search?q=wiseflow+addon&type=topics\n\n### Sort options supported for Repositories:\n\n- most stars: &s=stars&o=desc\n- fewest stars: 
&s=stars&o=asc\n- most forks: &s=forks&o=desc\n- fewest forks: &s=forks&o=asc\n- recently updated: &s=updated&o=desc\n- least recently updated: &s=updated&o=asc\n\n### Sort options supported for users:\n\n- most followers: &s=followers&o=desc\n- fewest followers: &s=followers&o=asc\n- most recently joined: &s=joined&o=desc\n- least recently joined: &s=joined&o=asc\n- most repositories: &s=repositories&o=desc\n- fewest repositories: &s=repositories&o=asc\n\nBoth repositories and users searches accept a language filter, e.g. &l=HTML\n\nFor example: https://github.com/search?q=wiseflow+language%3AHTML&type=users&s=repositories&o=desc&l=HTML\n\nSupported language filters: HTML, CSS, JavaScript, Python, Ruby, Java, C++, PHP, Swift, Go, Kotlin, TypeScript, Rust, Scala, Haskell, Lua, Shell, Dockerfile, JSON, YAML, Markdown, SVG\n"
  },
  {
    "path": "docs/more_powerful_search_skill/extra/arxiv.py",
    "content": "# SPDX-License-Identifier: AGPL-3.0-or-later\n\"\"\"arXiv is a free distribution service and an open-access archive for nearly\n2.4 million scholarly articles in the fields of physics, mathematics, computer\nscience, quantitative biology, quantitative finance, statistics, electrical\nengineering and systems science, and economics.\n\nThe engine uses the `arXiv API`_.\n\n.. _arXiv API: https://info.arxiv.org/help/api/user-manual.html\n\"\"\"\n\nimport typing as t\n\nfrom datetime import datetime\nfrom urllib.parse import urlencode\n\nfrom lxml import etree\nfrom lxml.etree import XPath\nfrom searx.utils import eval_xpath, eval_xpath_list, eval_xpath_getindex\nfrom searx.result_types import EngineResults\n\nif t.TYPE_CHECKING:\n    from searx.extended_types import SXNG_Response\n    from searx.search.processors import OnlineParams\n\nabout = {\n    \"website\": \"https://arxiv.org\",\n    \"wikidata_id\": \"Q118398\",\n    \"official_api_documentation\": \"https://info.arxiv.org/help/api/user-manual.html\",\n    \"use_official_api\": True,\n    \"require_api_key\": False,\n    \"results\": \"XML-RSS\",\n}\n\ncategories = [\"science\", \"scientific publications\"]\npaging = True\narxiv_max_results = 10\narxiv_search_prefix = \"all\"\n\"\"\"Search fields, for more details see, `Details of Query Construction`_.\n\n.. _Details of Query Construction:\n   https://info.arxiv.org/help/api/user-manual.html#51-details-of-query-construction\n\"\"\"\n\nbase_url = \"https://export.arxiv.org/api/query\"\n\"\"\"`arXiv API`_ URL, for more details see Query-Interface_\n\n.. 
_Query-Interface: https://info.arxiv.org/help/api/user-manual.html#_query_interface\n\"\"\"\n\narxiv_namespaces = {\n    \"atom\": \"http://www.w3.org/2005/Atom\",\n    \"arxiv\": \"http://arxiv.org/schemas/atom\",\n}\nxpath_entry = XPath(\"//atom:entry\", namespaces=arxiv_namespaces)\nxpath_title = XPath(\".//atom:title\", namespaces=arxiv_namespaces)\nxpath_id = XPath(\".//atom:id\", namespaces=arxiv_namespaces)\nxpath_summary = XPath(\".//atom:summary\", namespaces=arxiv_namespaces)\nxpath_author_name = XPath(\".//atom:author/atom:name\", namespaces=arxiv_namespaces)\nxpath_doi = XPath(\".//arxiv:doi\", namespaces=arxiv_namespaces)\nxpath_pdf = XPath(\".//atom:link[@title='pdf']\", namespaces=arxiv_namespaces)\nxpath_published = XPath(\".//atom:published\", namespaces=arxiv_namespaces)\nxpath_journal = XPath(\".//arxiv:journal_ref\", namespaces=arxiv_namespaces)\nxpath_category = XPath(\".//atom:category/@term\", namespaces=arxiv_namespaces)\nxpath_comment = XPath(\"./arxiv:comment\", namespaces=arxiv_namespaces)\n\n\ndef request(query: str, params: \"OnlineParams\") -> None:\n\n    args = {\n        \"search_query\": f\"{arxiv_search_prefix}:{query}\",\n        \"start\": (params[\"pageno\"] - 1) * arxiv_max_results,\n        \"max_results\": arxiv_max_results,\n    }\n    params[\"url\"] = f\"{base_url}?{urlencode(args)}\"\n\n\ndef response(resp: \"SXNG_Response\") -> EngineResults:\n\n    res = EngineResults()\n\n    dom = etree.fromstring(resp.content)\n    for entry in eval_xpath_list(dom, xpath_entry):\n\n        title: str = eval_xpath_getindex(entry, xpath_title, 0).text\n\n        url: str = eval_xpath_getindex(entry, xpath_id, 0).text\n        abstract: str = eval_xpath_getindex(entry, xpath_summary, 0).text\n\n        authors: list[str] = [author.text for author in eval_xpath_list(entry, xpath_author_name)]\n\n        #  doi\n        doi_element = eval_xpath_getindex(entry, xpath_doi, 0, default=None)\n        doi: str = \"\" if doi_element is None 
else doi_element.text\n\n        # pdf\n        pdf_element = eval_xpath_getindex(entry, xpath_pdf, 0, default=None)\n        pdf_url: str = \"\" if pdf_element is None else pdf_element.attrib.get(\"href\")\n\n        # journal\n        journal_element = eval_xpath_getindex(entry, xpath_journal, 0, default=None)\n        journal: str = \"\" if journal_element is None else journal_element.text\n\n        # tags\n        tag_elements = eval_xpath(entry, xpath_category)\n        tags: list[str] = [str(tag) for tag in tag_elements]\n\n        # comments\n        comments_elements = eval_xpath_getindex(entry, xpath_comment, 0, default=None)\n        comments: str = \"\" if comments_elements is None else comments_elements.text\n\n        publishedDate = datetime.strptime(eval_xpath_getindex(entry, xpath_published, 0).text, \"%Y-%m-%dT%H:%M:%SZ\")\n\n        res.add(\n            res.types.Paper(\n                url=url,\n                title=title,\n                publishedDate=publishedDate,\n                content=abstract,\n                doi=doi,\n                authors=authors,\n                journal=journal,\n                tags=tags,\n                comments=comments,\n                pdf_url=pdf_url,\n            )\n        )\n\n    return res\n"
  },
  {
    "path": "docs/more_powerful_search_skill/extra/baidu.py",
    "content": "# SPDX-License-Identifier: AGPL-3.0-or-later\n\"\"\"Baidu_\n\n.. _Baidu: https://www.baidu.com\n\"\"\"\n\n# There exits a https://github.com/ohblue/baidu-serp-api/\n# but we don't use it here (may we can learn from).\n\nfrom urllib.parse import urlencode\nfrom datetime import datetime\nfrom html import unescape\nimport time\nimport json\n\nfrom searx.exceptions import SearxEngineAPIException, SearxEngineCaptchaException\nfrom searx.utils import html_to_text\n\nabout = {\n    \"website\": \"https://www.baidu.com\",\n    \"wikidata_id\": \"Q14772\",\n    \"official_api_documentation\": None,\n    \"use_official_api\": False,\n    \"require_api_key\": False,\n    \"results\": \"JSON\",\n    \"language\": \"zh\",\n}\n\npaging = True\ncategories = []\nresults_per_page = 10\n\nbaidu_category = 'general'\n\ntime_range_support = True\ntime_range_dict = {\"day\": 86400, \"week\": 604800, \"month\": 2592000, \"year\": 31536000}\n\n\ndef init(_):\n    if baidu_category not in ('general', 'images', 'it'):\n        raise SearxEngineAPIException(f\"Unsupported category: {baidu_category}\")\n\n\ndef request(query, params):\n    page_num = params[\"pageno\"]\n\n    category_config = {\n        'general': {\n            'endpoint': 'https://www.baidu.com/s',\n            'params': {\n                \"wd\": query,\n                \"rn\": results_per_page,\n                \"pn\": (page_num - 1) * results_per_page,\n                \"tn\": \"json\",\n            },\n        },\n        'images': {\n            'endpoint': 'https://image.baidu.com/search/acjson',\n            'params': {\n                \"word\": query,\n                \"rn\": results_per_page,\n                \"pn\": (page_num - 1) * results_per_page,\n                \"tn\": \"resultjson_com\",\n            },\n        },\n        'it': {\n            'endpoint': 'https://kaifa.baidu.com/rest/v1/search',\n            'params': {\n                \"wd\": query,\n                \"pageSize\": 
results_per_page,\n                \"pageNum\": page_num,\n                \"paramList\": f\"page_num={page_num},page_size={results_per_page}\",\n                \"position\": 0,\n            },\n        },\n    }\n\n    query_params = category_config[baidu_category]['params']\n    query_url = category_config[baidu_category]['endpoint']\n\n    if params.get(\"time_range\") in time_range_dict:\n        now = int(time.time())\n        past = now - time_range_dict[params[\"time_range\"]]\n\n        if baidu_category == 'general':\n            query_params[\"gpc\"] = f\"stf={past},{now}|stftype=1\"\n\n        if baidu_category == 'it':\n            query_params[\"paramList\"] += f\",timestamp_range={past}-{now}\"\n\n    params[\"url\"] = f\"{query_url}?{urlencode(query_params)}\"\n    params[\"allow_redirects\"] = False\n    return params\n\n\ndef response(resp):\n    # Detect Baidu Captcha, it will redirect to wappass.baidu.com\n    if 'wappass.baidu.com/static/captcha' in resp.headers.get('Location', ''):\n        raise SearxEngineCaptchaException()\n\n    text = resp.text\n    if baidu_category == 'images':\n        # baidu's JSON encoder wrongly quotes / and ' characters by \\\\ and \\'\n        text = text.replace(r\"\\/\", \"/\").replace(r\"\\'\", \"'\")\n    data = json.loads(text, strict=False)\n    parsers = {'general': parse_general, 'images': parse_images, 'it': parse_it}\n\n    return parsers[baidu_category](data)\n\n\ndef parse_general(data):\n    results = []\n    if not data.get(\"feed\", {}).get(\"entry\"):\n        raise SearxEngineAPIException(\"Invalid response\")\n\n    for entry in data[\"feed\"][\"entry\"]:\n        if not entry.get(\"title\") or not entry.get(\"url\"):\n            continue\n\n        published_date = None\n        if entry.get(\"time\"):\n            try:\n                published_date = datetime.fromtimestamp(entry[\"time\"])\n            except (ValueError, TypeError):\n                published_date = None\n\n        # title 
and content sometimes containing characters such as &amp; &#39; &quot; etc...\n        title = unescape(entry[\"title\"])\n        content = unescape(entry.get(\"abs\", \"\"))\n\n        results.append(\n            {\n                \"title\": title,\n                \"url\": entry[\"url\"],\n                \"content\": content,\n                \"publishedDate\": published_date,\n            }\n        )\n    return results\n\n\ndef parse_images(data):\n    results = []\n    if \"data\" in data:\n        for item in data[\"data\"]:\n            if not item:\n                # the last item in the JSON list is empty, the JSON string ends with \"}, {}]\"\n                continue\n            replace_url = item.get(\"replaceUrl\", [{}])[0]\n            width = item.get(\"width\")\n            height = item.get(\"height\")\n            img_date = item.get(\"bdImgnewsDate\")\n            publishedDate = None\n            if img_date:\n                publishedDate = datetime.strptime(img_date, \"%Y-%m-%d %H:%M\")\n            results.append(\n                {\n                    \"template\": \"images.html\",\n                    \"url\": replace_url.get(\"FromURL\"),\n                    \"thumbnail_src\": item.get(\"thumbURL\"),\n                    \"img_src\": replace_url.get(\"ObjURL\"),\n                    \"title\": html_to_text(item.get(\"fromPageTitle\")),\n                    \"source\": item.get(\"fromURLHost\"),\n                    \"resolution\": f\"{width} x {height}\",\n                    \"img_format\": item.get(\"type\"),\n                    \"filesize\": item.get(\"filesize\"),\n                    \"publishedDate\": publishedDate,\n                }\n            )\n    return results\n\n\ndef parse_it(data):\n    results = []\n    if not data.get(\"data\", {}).get(\"documents\", {}).get(\"data\"):\n        raise SearxEngineAPIException(\"Invalid response\")\n\n    for entry in data[\"data\"][\"documents\"][\"data\"]:\n        
results.append(\n            {\n                'title': entry[\"techDocDigest\"][\"title\"],\n                'url': entry[\"techDocDigest\"][\"url\"],\n                'content': entry[\"techDocDigest\"][\"summary\"],\n            }\n        )\n    return results\n"
  },
  {
    "path": "docs/more_powerful_search_skill/extra/bing.py",
    "content": "# SPDX-License-Identifier: AGPL-3.0-or-later\n\"\"\"This is the implementation of the Bing-WEB engine. Some of this\nimplementations are shared by other engines:\n\n- :ref:`bing images engine`\n- :ref:`bing news engine`\n- :ref:`bing videos engine`\n\nOn the `preference page`_ Bing offers a lot of languages an regions (see section\nLANGUAGE and COUNTRY/REGION).  The Language is the language of the UI, we need\nin SearXNG to get the translations of data such as *\"published last week\"*.\n\nThere is a description of the official search-APIs_, unfortunately this is not\nthe API we can use or that bing itself would use.  You can look up some things\nin the API to get a better picture of bing, but the value specifications like\nthe market codes are usually outdated or at least no longer used by bing itself.\n\nThe market codes have been harmonized and are identical for web, video and\nimages.  The news area has also been harmonized with the other categories.  Only\npolitical adjustments still seem to be made -- for example, there is no news\ncategory for the Chinese market.\n\n.. _preference page: https://www.bing.com/account/general\n.. 
_search-APIs: https://learn.microsoft.com/en-us/bing/search-apis/\n\n\"\"\"\n# pylint: disable=too-many-branches, invalid-name\n\nimport base64\nimport re\nimport time\nfrom urllib.parse import parse_qs, urlencode, urlparse\n\nimport babel\nimport babel.languages\nfrom lxml import html\n\nfrom searx.enginelib.traits import EngineTraits\nfrom searx.exceptions import SearxEngineAPIException\nfrom searx.locales import language_tag, region_tag\nfrom searx.utils import eval_xpath, eval_xpath_getindex, eval_xpath_list, extract_text\n\nabout = {\n    \"website\": \"https://www.bing.com\",\n    \"wikidata_id\": \"Q182496\",\n    \"official_api_documentation\": \"https://www.microsoft.com/en-us/bing/apis/bing-web-search-api\",\n    \"use_official_api\": False,\n    \"require_api_key\": False,\n    \"results\": \"HTML\",\n}\n\n# engine dependent config\ncategories = [\"general\", \"web\"]\npaging = True\nmax_page = 200\n\"\"\"200 pages maximum (``&first=1991``)\"\"\"\n\ntime_range_support = True\nsafesearch = True\n\"\"\"Bing results are always SFW.  
To get NSFW links from bing some age\nverification by a cookie is needed / thats not possible in SearXNG.\n\"\"\"\n\nbase_url = \"https://www.bing.com/search\"\n\"\"\"Bing (Web) search URL\"\"\"\n\n\ndef _page_offset(pageno):\n    return (int(pageno) - 1) * 10 + 1\n\n\ndef set_bing_cookies(params, engine_language, engine_region):\n    params[\"cookies\"][\"_EDGE_CD\"] = f\"m={engine_region}&u={engine_language}\"\n    params[\"cookies\"][\"_EDGE_S\"] = f\"mkt={engine_region}&ui={engine_language}\"\n    logger.debug(\"bing cookies: %s\", params[\"cookies\"])\n\n\ndef request(query, params):\n    \"\"\"Assemble a Bing-Web request.\"\"\"\n\n    engine_region = traits.get_region(params[\"searxng_locale\"], traits.all_locale)  # type: ignore\n    engine_language = traits.get_language(params[\"searxng_locale\"], \"en\")  # type: ignore\n    set_bing_cookies(params, engine_language, engine_region)\n\n    page = params.get(\"pageno\", 1)\n    query_params = {\n        \"q\": query,\n        # if arg 'pq' is missed, sometimes on page 4 we get results from page 1,\n        # don't ask why it is only sometimes / its M$ and they have never been\n        # deterministic ;)\n        \"pq\": query,\n    }\n\n    # To get correct page, arg first and this arg FORM is needed, the value PERE\n    # is on page 2, on page 3 its PERE1 and on page 4 its PERE2 .. 
and so forth.\n    # The 'first' arg should never send on page 1.\n\n    if page > 1:\n        query_params[\"first\"] = _page_offset(page)  # see also arg FORM\n    if page == 2:\n        query_params[\"FORM\"] = \"PERE\"\n    elif page > 2:\n        query_params[\"FORM\"] = \"PERE%s\" % (page - 2)\n\n    params[\"url\"] = f\"{base_url}?{urlencode(query_params)}\"\n\n    if params.get(\"time_range\"):\n        unix_day = int(time.time() / 86400)\n        time_ranges = {\n            \"day\": \"1\",\n            \"week\": \"2\",\n            \"month\": \"3\",\n            \"year\": f\"5_{unix_day - 365}_{unix_day}\",\n        }\n        params[\"url\"] += f'&filters=ex1:\"ez{time_ranges[params[\"time_range\"]]}\"'\n\n    # in some regions where geoblocking is employed (e.g. China),\n    # www.bing.com redirects to the regional version of Bing\n    params[\"allow_redirects\"] = True\n\n    return params\n\n\ndef response(resp):\n    # pylint: disable=too-many-locals\n\n    results = []\n    result_len = 0\n\n    dom = html.fromstring(resp.text)\n\n    # parse results again if nothing is found yet\n\n    for result in eval_xpath_list(dom, '//ol[@id=\"b_results\"]/li[contains(@class, \"b_algo\")]'):\n        link = eval_xpath_getindex(result, \".//h2/a\", 0, None)\n        if link is None:\n            continue\n        url = link.attrib.get(\"href\")\n        title = extract_text(link)\n\n        content = eval_xpath(result, \".//p\")\n        for p in content:\n            # Make sure that the element is free of:\n            #  <span class=\"algoSlug_icon\" # data-priority=\"2\">Web</span>\n            for e in p.xpath('.//span[@class=\"algoSlug_icon\"]'):\n                e.getparent().remove(e)\n        content = extract_text(content)\n\n        # get the real URL\n        if url.startswith(\"https://www.bing.com/ck/a?\"):\n            # get the first value of u parameter\n            url_query = urlparse(url).query\n            parsed_url_query = 
parse_qs(url_query)\n            param_u = parsed_url_query[\"u\"][0]\n            # remove \"a1\" in front\n            encoded_url = param_u[2:]\n            # add padding\n            encoded_url = encoded_url + \"=\" * (-len(encoded_url) % 4)\n            # decode base64 encoded URL\n            url = base64.urlsafe_b64decode(encoded_url).decode()\n\n        # append result\n        results.append({\"url\": url, \"title\": title, \"content\": content})\n\n    # get number_of_results\n    if results:\n        result_len_container = \"\".join(eval_xpath(dom, '//span[@class=\"sb_count\"]//text()'))\n        if \"-\" in result_len_container:\n            start_str, result_len_container = re.split(r\"-\\d+\", result_len_container)\n            start = int(start_str)\n        else:\n            start = 1\n\n        result_len_container = re.sub(\"[^0-9]\", \"\", result_len_container)\n        if len(result_len_container) > 0:\n            result_len = int(result_len_container)\n\n        expected_start = _page_offset(resp.search_params.get(\"pageno\", 1))\n\n        if expected_start != start:\n            if expected_start > result_len:\n                # Avoid reading more results than available.\n                # For example, if there is 100 results from some search and we try to get results from 120 to 130,\n                # Bing will send back the results from 0 to 10 and no error.\n                # If we compare results count with the first parameter of the request we can avoid this \"invalid\"\n                # results.\n                return []\n\n            # Sometimes Bing will send back the first result page instead of the requested page as a rate limiting\n            # measure.\n            msg = f\"Expected results to start at {expected_start}, but got results starting at {start}\"\n            raise SearxEngineAPIException(msg)\n\n    results.append({\"number_of_results\": result_len})\n    return results\n\n\ndef fetch_traits(engine_traits: 
EngineTraits):\n    \"\"\"Fetch languages and regions from Bing-Web.\"\"\"\n    # pylint: disable=import-outside-toplevel\n\n    from searx.network import get  # see https://github.com/searxng/searxng/issues/762\n    from searx.utils import gen_useragent\n\n    headers = {\n        \"User-Agent\": gen_useragent(),\n        \"Accept\": \"text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8\",\n        \"Accept-Language\": \"en-US;q=0.5,en;q=0.3\",\n        \"DNT\": \"1\",\n        \"Connection\": \"keep-alive\",\n        \"Upgrade-Insecure-Requests\": \"1\",\n        \"Sec-GPC\": \"1\",\n        \"Cache-Control\": \"max-age=0\",\n    }\n\n    resp = get(\"https://www.bing.com/account/general\", headers=headers, timeout=5)\n    if not resp.ok:\n        raise RuntimeError(\"Response from Bing is not OK.\")\n\n    dom = html.fromstring(resp.text)\n\n    # languages\n\n    engine_traits.languages[\"zh\"] = \"zh-hans\"\n\n    map_lang = {\"prs\": \"fa-AF\", \"en\": \"en-us\"}\n    bing_ui_lang_map = {\n        # HINT: this list probably needs to be supplemented\n        \"en\": \"us\",  # en --> en-us\n        \"da\": \"dk\",  # da --> da-dk\n    }\n\n    for href in eval_xpath(dom, '//div[@id=\"language-section-content\"]//div[@class=\"languageItem\"]/a/@href'):\n        eng_lang = parse_qs(urlparse(href).query)[\"setlang\"][0]\n        babel_lang = map_lang.get(eng_lang, eng_lang)\n        try:\n            sxng_tag = language_tag(babel.Locale.parse(babel_lang.replace(\"-\", \"_\")))\n        except babel.UnknownLocaleError:\n            print(\"ERROR: language (%s) is unknown by babel\" % (babel_lang))\n            continue\n        # Language (e.g. 'en' or 'de') from https://www.bing.com/account/general\n        # is converted by bing to 'en-us' or 'de-de'.  But only if there is not\n        # already a '-' delemitter in the language.  
For instance 'pt-PT' -->\n        # 'pt-pt' and 'pt-br' --> 'pt-br'\n        bing_ui_lang = eng_lang.lower()\n        if \"-\" not in bing_ui_lang:\n            bing_ui_lang = bing_ui_lang + \"-\" + bing_ui_lang_map.get(bing_ui_lang, bing_ui_lang)\n\n        conflict = engine_traits.languages.get(sxng_tag)\n        if conflict:\n            if conflict != bing_ui_lang:\n                print(f\"CONFLICT: babel {sxng_tag} --> {conflict}, {bing_ui_lang}\")\n            continue\n        engine_traits.languages[sxng_tag] = bing_ui_lang\n\n    # regions (aka \"market codes\")\n\n    engine_traits.regions[\"zh-CN\"] = \"zh-cn\"\n\n    map_market_codes = {\n        \"zh-hk\": \"en-hk\",  # not sure why, but at M$ this is the market code for Hong Kong\n    }\n    for href in eval_xpath(dom, '//div[@id=\"region-section-content\"]//div[@class=\"regionItem\"]/a/@href'):\n        cc_tag = parse_qs(urlparse(href).query)[\"cc\"][0]\n        if cc_tag == \"clear\":\n            engine_traits.all_locale = cc_tag\n            continue\n\n        # add market codes from official languages of the country ..\n        for lang_tag in babel.languages.get_official_languages(cc_tag, de_facto=True):\n            if lang_tag not in engine_traits.languages.keys():\n                # print(\"ignore lang: %s <-- %s\" % (cc_tag, lang_tag))\n                continue\n            lang_tag = lang_tag.split(\"_\")[0]  # zh_Hant --> zh\n            market_code = f\"{lang_tag}-{cc_tag}\"  # zh-tw\n\n            market_code = map_market_codes.get(market_code, market_code)\n            sxng_tag = region_tag(babel.Locale.parse(\"%s_%s\" % (lang_tag, cc_tag.upper())))\n            conflict = engine_traits.regions.get(sxng_tag)\n            if conflict:\n                if conflict != market_code:\n                    print(\"CONFLICT: babel %s --> %s, %s\" % (sxng_tag, conflict, market_code))\n                    continue\n            engine_traits.regions[sxng_tag] = market_code\n"
  },
  {
    "path": "docs/more_powerful_search_skill/extra/bing_images.py",
    "content": "# SPDX-License-Identifier: AGPL-3.0-or-later\n\"\"\"Bing-Images: description see :py:obj:`searx.engines.bing`.\"\"\"\n# pylint: disable=invalid-name\nimport json\nfrom urllib.parse import urlencode\n\nfrom lxml import html\n\nfrom searx.engines.bing import set_bing_cookies\nfrom searx.engines.bing import fetch_traits  # pylint: disable=unused-import\n\n# about\nabout = {\n    \"website\": 'https://www.bing.com/images',\n    \"wikidata_id\": 'Q182496',\n    \"official_api_documentation\": 'https://www.microsoft.com/en-us/bing/apis/bing-image-search-api',\n    \"use_official_api\": False,\n    \"require_api_key\": False,\n    \"results\": 'HTML',\n}\n\n# engine dependent config\ncategories = ['images', 'web']\npaging = True\nsafesearch = True\ntime_range_support = True\n\nbase_url = 'https://www.bing.com/images/async'\n\"\"\"Bing (Images) search URL\"\"\"\n\ntime_map = {\n    'day': 60 * 24,\n    'week': 60 * 24 * 7,\n    'month': 60 * 24 * 31,\n    'year': 60 * 24 * 365,\n}\n\n\ndef request(query, params):\n    \"\"\"Assemble a Bing-Image request.\"\"\"\n\n    engine_region = traits.get_region(params['searxng_locale'], traits.all_locale)  # type: ignore\n    engine_language = traits.get_language(params['searxng_locale'], 'en')  # type: ignore\n    set_bing_cookies(params, engine_language, engine_region)\n\n    # build URL query\n    # - example: https://www.bing.com/images/async?q=foo&async=content&first=1&count=35\n    query_params = {\n        'q': query,\n        'async': '1',\n        # to simplify the page count lets use the default of 35 images per page\n        'first': (int(params.get('pageno', 1)) - 1) * 35 + 1,\n        'count': 35,\n    }\n\n    # time range\n    # - example: one year (525600 minutes) 'qft=+filterui:age-lt525600'\n\n    if params['time_range']:\n        query_params['qft'] = 'filterui:age-lt%s' % time_map[params['time_range']]\n\n    params['url'] = base_url + '?' 
+ urlencode(query_params)\n\n    return params\n\n\ndef response(resp):\n    \"\"\"Get response from Bing-Images\"\"\"\n\n    results = []\n    dom = html.fromstring(resp.text)\n\n    for result in dom.xpath('//ul[contains(@class, \"dgControl_list\")]/li'):\n\n        metadata = result.xpath('.//a[@class=\"iusc\"]/@m')\n        if not metadata:\n            continue\n\n        # reuse the attribute value fetched above instead of evaluating the XPath twice\n        metadata = json.loads(metadata[0])\n        title = ' '.join(result.xpath('.//div[@class=\"infnmpt\"]//a/text()')).strip()\n        img_format = ' '.join(result.xpath('.//div[@class=\"imgpt\"]/div/span/text()')).strip().split(\" · \")\n        source = ' '.join(result.xpath('.//div[@class=\"imgpt\"]//div[@class=\"lnkw\"]//a/text()')).strip()\n        results.append(\n            {\n                'template': 'images.html',\n                'url': metadata['purl'],\n                'thumbnail_src': metadata['turl'],\n                'img_src': metadata['murl'],\n                'content': metadata.get('desc'),\n                'title': title,\n                'source': source,\n                'resolution': img_format[0],\n                'img_format': img_format[1] if len(img_format) >= 2 else None,\n            }\n        )\n    return results\n"
  },
  {
    "path": "docs/more_powerful_search_skill/extra/bing_news.py",
    "content": "# SPDX-License-Identifier: AGPL-3.0-or-later\n\"\"\"Bing-News: description see :py:obj:`searx.engines.bing`.\n\n.. hint::\n\n   Bing News is *different* in some ways!\n\n\"\"\"\n\n# pylint: disable=invalid-name\n\nfrom urllib.parse import urlencode\n\nfrom lxml import html\n\nfrom searx.utils import eval_xpath, extract_text, eval_xpath_list, eval_xpath_getindex\nfrom searx.enginelib.traits import EngineTraits\nfrom searx.engines.bing import set_bing_cookies\n\n# about\nabout = {\n    \"website\": 'https://www.bing.com/news',\n    \"wikidata_id\": 'Q2878637',\n    \"official_api_documentation\": 'https://www.microsoft.com/en-us/bing/apis/bing-news-search-api',\n    \"use_official_api\": False,\n    \"require_api_key\": False,\n    \"results\": 'RSS',\n}\n\n# engine dependent config\ncategories = ['news']\npaging = True\n\"\"\"If go through the pages and there are actually no new results for another\npage, then bing returns the results from the last page again.\"\"\"\n\ntime_range_support = True\ntime_map = {\n    'day': 'interval=\"4\"',\n    'week': 'interval=\"7\"',\n    'month': 'interval=\"9\"',\n}\n\"\"\"A string '4' means *last hour*.  
We use *last hour* for ``day`` here since the\ndifference between *last day* and *last week* in the result list is only marginal.\nBing has no ``year`` range for news; we use ``month`` instead.\"\"\"\n\nbase_url = 'https://www.bing.com/news/infinitescrollajax'\n\"\"\"Bing (News) search URL\"\"\"\n\n\ndef request(query, params):\n    \"\"\"Assemble a Bing-News request.\"\"\"\n\n    engine_region = traits.get_region(params['searxng_locale'], traits.all_locale)  # type: ignore\n    engine_language = traits.get_language(params['searxng_locale'], 'en')  # type: ignore\n    set_bing_cookies(params, engine_language, engine_region)\n\n    # build URL query\n    #\n    # example: https://www.bing.com/news/infinitescrollajax?q=london&first=1\n\n    page = int(params.get('pageno', 1)) - 1\n    query_params = {\n        'q': query,\n        'InfiniteScroll': 1,\n        # to simplify the page count, use the default of 10 results per page\n        'first': page * 10 + 1,\n        'SFX': page,\n        'form': 'PTFTNR',\n        'setlang': engine_region.split('-')[0],\n        'cc': engine_region.split('-')[-1],\n    }\n\n    if params['time_range']:\n        query_params['qft'] = time_map.get(params['time_range'], 'interval=\"9\"')\n\n    params['url'] = base_url + '?' 
+ urlencode(query_params)\n\n    return params\n\n\ndef response(resp):\n    \"\"\"Get response from Bing-News\"\"\"\n    results = []\n\n    if not resp.ok or not resp.text:\n        return results\n\n    dom = html.fromstring(resp.text)\n\n    for newsitem in eval_xpath_list(dom, '//div[contains(@class, \"newsitem\")]'):\n\n        link = eval_xpath_getindex(newsitem, './/a[@class=\"title\"]', 0, None)\n        if link is None:\n            continue\n        url = link.attrib.get('href')\n        title = extract_text(link)\n        content = extract_text(eval_xpath(newsitem, './/div[@class=\"snippet\"]'))\n\n        metadata = []\n        source = eval_xpath_getindex(newsitem, './/div[contains(@class, \"source\")]', 0, None)\n        if source is not None:\n            for item in (\n                eval_xpath_getindex(source, './/span[@aria-label]/@aria-label', 0, None),\n                # eval_xpath_getindex(source, './/a', 0, None),\n                # eval_xpath_getindex(source, './div/span', 3, None),\n                link.attrib.get('data-author'),\n            ):\n                if item is not None:\n                    t = extract_text(item)\n                    if t and t.strip():\n                        metadata.append(t.strip())\n        metadata = ' | '.join(metadata)\n\n        thumbnail = None\n        imagelink = eval_xpath_getindex(newsitem, './/a[@class=\"imagelink\"]//img', 0, None)\n        if imagelink is not None:\n            thumbnail = imagelink.attrib.get('src')\n            # the <img> may lack a src attribute; only prefix relative URLs that exist\n            if thumbnail and not thumbnail.startswith(\"https://www.bing.com\"):\n                thumbnail = 'https://www.bing.com/' + thumbnail\n\n        results.append(\n            {\n                'url': url,\n                'title': title,\n                'content': content,\n                'thumbnail': thumbnail,\n                'metadata': metadata,\n            }\n        )\n\n    return results\n\n\ndef fetch_traits(engine_traits: EngineTraits):\n    \"\"\"Fetch languages 
and regions from Bing-News.\"\"\"\n    # pylint: disable=import-outside-toplevel\n\n    from searx.engines.bing import fetch_traits as _f\n\n    _f(engine_traits)\n\n    # fix market codes not known by bing news:\n\n    # In bing the market code 'zh-cn' exists, but there is no 'news' category in\n    # bing for this market.  As an alternative we use the market code from Hong\n    # Kong.  Even if this is not correct, it is better than having no hits at\n    # all, or sending false queries to bing that could raise the suspicion of a\n    # bot.\n\n    # HINT: 'en-hk' is the region code; it does not indicate the language en!!\n    engine_traits.regions['zh-CN'] = 'en-hk'\n"
  },
  {
    "path": "docs/more_powerful_search_skill/extra/flickr.py",
    "content": "# SPDX-License-Identifier: AGPL-3.0-or-later\n\"\"\"\nFlickr (Images)\n\nMore info on api-key : https://www.flickr.com/services/apps/create/\n\"\"\"\n\nfrom json import loads\nfrom urllib.parse import urlencode\n\n# about\nabout = {\n    \"website\": 'https://www.flickr.com',\n    \"wikidata_id\": 'Q103204',\n    \"official_api_documentation\": 'https://secure.flickr.com/services/api/flickr.photos.search.html',\n    \"use_official_api\": True,\n    \"require_api_key\": True,\n    \"results\": 'JSON',\n}\n\ncategories = ['images']\n\nnb_per_page = 15\npaging = True\napi_key = None\n\n\nurl = (\n    'https://api.flickr.com/services/rest/?method=flickr.photos.search'\n    + '&api_key={api_key}&{text}&sort=relevance'\n    + '&extras=description%2C+owner_name%2C+url_o%2C+url_n%2C+url_z'\n    + '&per_page={nb_per_page}&format=json&nojsoncallback=1&page={page}'\n)\nphoto_url = 'https://www.flickr.com/photos/{userid}/{photoid}'\n\npaging = True\n\n\ndef build_flickr_url(user_id, photo_id):\n    return photo_url.format(userid=user_id, photoid=photo_id)\n\n\ndef request(query, params):\n    params['url'] = url.format(\n        text=urlencode({'text': query}), api_key=api_key, nb_per_page=nb_per_page, page=params['pageno']\n    )\n    return params\n\n\ndef response(resp):\n    results = []\n\n    search_results = loads(resp.text)\n\n    # return empty array if there are no results\n    if 'photos' not in search_results:\n        return []\n\n    if 'photo' not in search_results['photos']:\n        return []\n\n    photos = search_results['photos']['photo']\n\n    # parse results\n    for photo in photos:\n        if 'url_o' in photo:\n            img_src = photo['url_o']\n        elif 'url_z' in photo:\n            img_src = photo['url_z']\n        else:\n            continue\n\n        # For a bigger thumbnail, keep only the url_z, not the url_n\n        if 'url_n' in photo:\n            thumbnail_src = photo['url_n']\n        elif 'url_z' in photo:\n        
    thumbnail_src = photo['url_z']\n        else:\n            thumbnail_src = img_src\n\n        # append result\n        results.append(\n            {\n                'url': build_flickr_url(photo['owner'], photo['id']),\n                'title': photo['title'],\n                'img_src': img_src,\n                'thumbnail_src': thumbnail_src,\n                'content': photo['description']['_content'],\n                'author': photo['ownername'],\n                'template': 'images.html',\n            }\n        )\n\n    # return results\n    return results\n"
  },
  {
    "path": "docs/more_powerful_search_skill/extra/quark.py",
    "content": "# SPDX-License-Identifier: AGPL-3.0-or-later\n\"\"\"Quark (Shenma) search engine for searxng\"\"\"\n\nfrom urllib.parse import urlencode\nfrom datetime import datetime\nimport re\nimport json\n\nfrom searx.utils import html_to_text\nfrom searx.exceptions import SearxEngineAPIException, SearxEngineCaptchaException\n\n# Metadata\nabout = {\n    \"website\": \"https://quark.sm.cn/\",\n    \"wikidata_id\": \"Q48816502\",\n    \"use_official_api\": False,\n    \"require_api_key\": False,\n    \"results\": \"HTML\",\n    \"language\": \"zh\",\n}\n\n# Engine Configuration\ncategories = []\npaging = True\nresults_per_page = 10\n\nquark_category = 'general'\n\ntime_range_support = True\ntime_range_dict = {'day': '4', 'week': '3', 'month': '2', 'year': '1'}\n\nCAPTCHA_PATTERN = r'\\{[^{]*?\"action\"\\s*:\\s*\"captcha\"\\s*,\\s*\"url\"\\s*:\\s*\"([^\"]+)\"[^{]*?\\}'\n\n\ndef is_alibaba_captcha(html):\n    \"\"\"\n    Detects if the response contains an Alibaba X5SEC CAPTCHA page.\n\n    Quark may return a CAPTCHA challenge after 9 requests in a short period.\n\n    Typically, the ban duration is around 15 minutes.\n    \"\"\"\n    return bool(re.search(CAPTCHA_PATTERN, html))\n\n\ndef init(_):\n    if quark_category not in ('general', 'images'):\n        raise SearxEngineAPIException(f\"Unsupported category: {quark_category}\")\n\n\ndef request(query, params):\n    page_num = params[\"pageno\"]\n\n    category_config = {\n        'general': {\n            'endpoint': 'https://quark.sm.cn/s',\n            'params': {\n                \"q\": query,\n                \"layout\": \"html\",\n                \"page\": page_num,\n            },\n        },\n        'images': {\n            'endpoint': 'https://vt.sm.cn/api/pic/list',\n            'params': {\n                \"query\": query,\n                \"limit\": results_per_page,\n                \"start\": (page_num - 1) * results_per_page,\n            },\n        },\n    }\n\n    query_params = 
category_config[quark_category]['params']\n    query_url = category_config[quark_category]['endpoint']\n\n    if time_range_dict.get(params['time_range']) and quark_category == 'general':\n        query_params[\"tl_request\"] = time_range_dict.get(params['time_range'])\n\n    params[\"url\"] = f\"{query_url}?{urlencode(query_params)}\"\n    return params\n\n\ndef response(resp):\n    results = []\n    text = resp.text\n\n    if is_alibaba_captcha(text):\n        raise SearxEngineCaptchaException(\n            suspended_time=900, message=\"Alibaba CAPTCHA detected. Please try again later.\"\n        )\n\n    if quark_category == 'images':\n        data = json.loads(text)\n        for item in data.get('data', {}).get('hit', {}).get('imgInfo', {}).get('item', []):\n            try:\n                published_date = datetime.fromtimestamp(int(item.get(\"publish_time\")))\n            except (ValueError, TypeError):\n                published_date = None\n\n            results.append(\n                {\n                    \"template\": \"images.html\",\n                    \"url\": item.get(\"imgUrl\"),\n                    \"thumbnail_src\": item.get(\"img\"),\n                    \"img_src\": item.get(\"bigPicUrl\"),\n                    \"title\": item.get(\"title\"),\n                    \"source\": item.get(\"site\"),\n                    \"resolution\": f\"{item['width']} x {item['height']}\",\n                    \"publishedDate\": published_date,\n                }\n            )\n\n    if quark_category == 'general':\n        # Quark returns a variety of different sc values on a single page, depending on the query type.\n        source_category_parsers = {\n            'addition': parse_addition,\n            'ai_page': parse_ai_page,\n            'baike_sc': parse_baike_sc,\n            'finance_shuidi': parse_finance_shuidi,\n            'kk_yidian_all': parse_kk_yidian_all,\n            'life_show_general_image': parse_life_show_general_image,\n            
'med_struct': parse_med_struct,\n            'music_new_song': parse_music_new_song,\n            'nature_result': parse_nature_result,\n            'news_uchq': parse_news_uchq,\n            'ss_note': parse_ss_note,\n            # ss_kv, ss_pic, ss_text, ss_video, baike, structure_web_novel use the same struct as ss_doc\n            'ss_doc': parse_ss_doc,\n            'ss_kv': parse_ss_doc,\n            'ss_pic': parse_ss_doc,\n            'ss_text': parse_ss_doc,\n            'ss_video': parse_ss_doc,\n            'baike': parse_ss_doc,\n            'structure_web_novel': parse_ss_doc,\n            'travel_dest_overview': parse_travel_dest_overview,\n            'travel_ranking_list': parse_travel_ranking_list,\n        }\n\n        pattern = r'<script\\s+type=\"application/json\"\\s+id=\"s-data-[^\"]+\"\\s+data-used-by=\"hydrate\">(.*?)</script>'\n        matches = re.findall(pattern, text, re.DOTALL)\n\n        for match in matches:\n            data = json.loads(match)\n            initial_data = data.get('data', {}).get('initialData', {})\n            extra_data = data.get('extraData', {})\n\n            source_category = extra_data.get('sc')\n\n            parsers = source_category_parsers.get(source_category)\n            if parsers:\n                parsed_results = parsers(initial_data)\n                if isinstance(parsed_results, list):\n                    # Extend if the result is a list\n                    results.extend(parsed_results)\n                else:\n                    # Append if it's a single result\n                    results.append(parsed_results)\n\n    return results\n\n\ndef parse_addition(data):\n    return {\n        \"title\": html_to_text(data.get('title', {}).get('content')),\n        \"url\": data.get('source', {}).get('url'),\n        \"content\": html_to_text(data.get('summary', {}).get('content')),\n    }\n\n\ndef parse_ai_page(data):\n    results = []\n    for item in data.get('list', []):\n        content = (\n       
     \" | \".join(map(str, item.get('content', [])))\n            if isinstance(item.get('content'), list)\n            else str(item.get('content'))\n        )\n\n        try:\n            published_date = datetime.fromtimestamp(int(item.get('source', {}).get('time')))\n        except (ValueError, TypeError):\n            published_date = None\n\n        results.append(\n            {\n                \"title\": html_to_text(item.get('title')),\n                \"url\": item.get('url'),\n                \"content\": html_to_text(content),\n                \"publishedDate\": published_date,\n            }\n        )\n    return results\n\n\ndef parse_baike_sc(data):\n    return {\n        \"title\": html_to_text(data.get('data', {}).get('title')),\n        \"url\": data.get('data', {}).get('url'),\n        \"content\": html_to_text(data.get('data', {}).get('abstract')),\n        \"thumbnail\": data.get('data', {}).get('img').replace(\"http://\", \"https://\"),\n    }\n\n\ndef parse_finance_shuidi(data):\n    content = \" | \".join(\n        (\n            info\n            for info in [\n                data.get('establish_time'),\n                data.get('company_status'),\n                data.get('controled_type'),\n                data.get('company_type'),\n                data.get('capital'),\n                data.get('address'),\n                data.get('business_scope'),\n            ]\n            if info\n        )\n    )\n    return {\n        \"title\": html_to_text(data.get('company_name')),\n        \"url\": data.get('title_url'),\n        \"content\": html_to_text(content),\n    }\n\n\ndef parse_kk_yidian_all(data):\n    content_list = []\n    for section in data.get('list_container', []):\n        for item in section.get('list_container', []):\n            if 'dot_text' in item:\n                content_list.append(item['dot_text'])\n\n    return {\n        \"title\": html_to_text(data.get('title')),\n        \"url\": data.get('title_url'),\n       
 \"content\": html_to_text(' '.join(content_list)),\n    }\n\n\ndef parse_life_show_general_image(data):\n    results = []\n    for item in data.get('image', []):\n        try:\n            published_date = datetime.fromtimestamp(int(item.get(\"publish_time\")))\n        except (ValueError, TypeError):\n            published_date = None\n\n        results.append(\n            {\n                \"template\": \"images.html\",\n                \"url\": item.get(\"imgUrl\"),\n                \"thumbnail_src\": item.get(\"img\"),\n                \"img_src\": item.get(\"bigPicUrl\"),\n                \"title\": item.get(\"title\"),\n                \"source\": item.get(\"site\"),\n                \"resolution\": f\"{item['width']} x {item['height']}\",\n                \"publishedDate\": published_date,\n            }\n        )\n    return results\n\n\ndef parse_med_struct(data):\n    return {\n        \"title\": html_to_text(data.get('title')),\n        \"url\": data.get('message', {}).get('statistics', {}).get('nu'),\n        \"content\": html_to_text(data.get('message', {}).get('content_text')),\n        \"thumbnail\": data.get('message', {}).get('video_img').replace(\"http://\", \"https://\"),\n    }\n\n\ndef parse_music_new_song(data):\n    results = []\n    for item in data.get('hit3', []):\n        results.append(\n            {\n                \"title\": f\"{item['song_name']} | {item['song_singer']}\",\n                \"url\": item.get(\"play_url\"),\n                \"content\": html_to_text(item.get(\"lyrics\")),\n                \"thumbnail\": item.get(\"image_url\").replace(\"http://\", \"https://\"),\n            }\n        )\n    return results\n\n\ndef parse_nature_result(data):\n    return {\"title\": html_to_text(data.get('title')), \"url\": data.get('url'), \"content\": html_to_text(data.get('desc'))}\n\n\ndef parse_news_uchq(data):\n    results = []\n    for item in data.get('feed', []):\n        try:\n            published_date = 
datetime.strptime(item.get('time'), \"%Y-%m-%d\")\n        except (ValueError, TypeError):\n            # Sometime Quark will return non-standard format like \"1天前\", set published_date as None\n            published_date = None\n\n        results.append(\n            {\n                \"title\": html_to_text(item.get('title')),\n                \"url\": item.get('url'),\n                \"content\": html_to_text(item.get('summary')),\n                \"thumbnail\": item.get('image').replace(\"http://\", \"https://\"),\n                \"publishedDate\": published_date,\n            }\n        )\n    return results\n\n\ndef parse_ss_doc(data):\n    published_date = None\n    try:\n        timestamp = int(data.get('sourceProps', {}).get('time'))\n\n        # Sometime Quark will return 0, set published_date as None\n        if timestamp != 0:\n            published_date = datetime.fromtimestamp(timestamp)\n    except (ValueError, TypeError):\n        pass\n\n    try:\n        thumbnail = data.get('picListProps', [])[0].get('src').replace(\"http://\", \"https://\")\n    except (ValueError, TypeError, IndexError):\n        thumbnail = None\n\n    return {\n        \"title\": html_to_text(\n            data.get('titleProps', {}).get('content')\n            # ss_kv variant 1 & 2\n            or data.get('title')\n        ),\n        \"url\": data.get('sourceProps', {}).get('dest_url')\n        # ss_kv variant 1\n        or data.get('normal_url')\n        # ss_kv variant 2\n        or data.get('url'),\n        \"content\": html_to_text(\n            data.get('summaryProps', {}).get('content')\n            # ss_doc variant 1\n            or data.get('message', {}).get('replyContent')\n            # ss_kv variant 1\n            or data.get('show_body')\n            # ss_kv variant 2\n            or data.get('desc')\n        ),\n        \"publishedDate\": published_date,\n        \"thumbnail\": thumbnail,\n    }\n\n\ndef parse_ss_note(data):\n    try:\n        
published_date = datetime.fromtimestamp(int(data.get('source', {}).get('time')))\n    except (ValueError, TypeError):\n        published_date = None\n\n    return {\n        \"title\": html_to_text(data.get('title', {}).get('content')),\n        \"url\": data.get('source', {}).get('dest_url'),\n        \"content\": html_to_text(data.get('summary', {}).get('content')),\n        \"publishedDate\": published_date,\n    }\n\n\ndef parse_travel_dest_overview(data):\n    return {\n        \"title\": html_to_text(data.get('strong', {}).get('title')),\n        \"url\": data.get('strong', {}).get('baike_url'),\n        \"content\": html_to_text(data.get('strong', {}).get('baike_text')),\n    }\n\n\ndef parse_travel_ranking_list(data):\n    return {\n        \"title\": html_to_text(data.get('title', {}).get('text')),\n        \"url\": data.get('title', {}).get('url'),\n        \"content\": html_to_text(data.get('title', {}).get('title_tag')),\n    }\n"
  },
  {
    "path": "docs/more_powerful_search_skill/extra/wikipedia.py",
    "content": "# SPDX-License-Identifier: AGPL-3.0-or-later\n\"\"\"This module implements the Wikipedia engine.  Some of this implementations\nare shared by other engines:\n\n- :ref:`wikidata engine`\n\nThe list of supported languages is :py:obj:`fetched <fetch_wikimedia_traits>` from\nthe article linked by :py:obj:`list_of_wikipedias`.\n\nUnlike traditional search engines, wikipedia does not support one Wikipedia for\nall languages, but there is one Wikipedia for each supported language. Some of\nthese Wikipedias have a LanguageConverter_ enabled\n(:py:obj:`rest_v1_summary_url`).\n\nA LanguageConverter_ (LC) is a system based on language variants that\nautomatically converts the content of a page into a different variant. A variant\nis mostly the same language in a different script.\n\n- `Wikipedias in multiple writing systems`_\n- `Automatic conversion between traditional and simplified Chinese characters`_\n\nPR-2554_:\n  The Wikipedia link returned by the API is still the same in all cases\n  (`https://zh.wikipedia.org/wiki/出租車`_) but if your browser's\n  ``Accept-Language`` is set to any of ``zh``, ``zh-CN``, ``zh-TW``, ``zh-HK``\n  or .. Wikipedia's LC automatically returns the desired script in their\n  web-page.\n\n  - You can test the API here: https://reqbin.com/gesg2kvx\n\n.. _https://zh.wikipedia.org/wiki/出租車:\n   https://zh.wikipedia.org/wiki/%E5%87%BA%E7%A7%9F%E8%BB%8A\n\nTo support Wikipedia's LanguageConverter_, a SearXNG request to Wikipedia uses\n:py:obj:`get_wiki_params` and :py:obj:`wiki_lc_locale_variants' in the\n:py:obj:`fetch_wikimedia_traits` function.\n\nTo test in SearXNG, query for ``!wp 出租車`` with each of the available Chinese\noptions:\n\n- ``!wp 出租車 :zh``    should show 出租車\n- ``!wp 出租車 :zh-CN`` should show 出租车\n- ``!wp 出租車 :zh-TW`` should show 計程車\n- ``!wp 出租車 :zh-HK`` should show 的士\n- ``!wp 出租車 :zh-SG`` should show 德士\n\n.. _LanguageConverter:\n   https://www.mediawiki.org/wiki/Writing_systems#LanguageConverter\n.. 
_Wikipedias in multiple writing systems:\n   https://meta.wikimedia.org/wiki/Wikipedias_in_multiple_writing_systems\n.. _Automatic conversion between traditional and simplified Chinese characters:\n   https://en.wikipedia.org/wiki/Chinese_Wikipedia#Automatic_conversion_between_traditional_and_simplified_Chinese_characters\n.. _PR-2554: https://github.com/searx/searx/pull/2554\n\n\"\"\"\n\nimport urllib.parse\n\nimport babel\nfrom lxml import html\n\nfrom searx import locales, utils\nfrom searx import network as _network\nfrom searx.enginelib.traits import EngineTraits\n\n# about\nabout = {\n    \"website\": \"https://www.wikipedia.org/\",\n    \"wikidata_id\": \"Q52\",\n    \"official_api_documentation\": \"https://en.wikipedia.org/api/\",\n    \"use_official_api\": True,\n    \"require_api_key\": False,\n    \"results\": \"JSON\",\n}\n\ndisplay_type = [\"infobox\"]\n\"\"\"A list of display types composed from ``infobox`` and ``list``.  The latter\none will add a hit to the result list.  The first one will show a hit in the\ninfo box.  Both values can be set, or one of the two can be set.\"\"\"\n\nlist_of_wikipedias = \"https://meta.wikimedia.org/wiki/List_of_Wikipedias\"\n\"\"\"`List of all wikipedias <https://meta.wikimedia.org/wiki/List_of_Wikipedias>`_\n\"\"\"\n\nwikipedia_article_depth = \"https://meta.wikimedia.org/wiki/Wikipedia_article_depth\"\n\"\"\"The *editing depth* of Wikipedia is one of several possible rough indicators\nof the encyclopedia's collaborative quality, showing how frequently its articles\nare updated.  The measurement of depth was introduced after some limitations of\nthe classic measurement of article count were realized.\n\"\"\"\n\nrest_v1_summary_url = \"https://{wiki_netloc}/api/rest_v1/page/summary/{title}\"\n\"\"\"\n`wikipedia rest_v1 summary API`_:\n  The summary response includes an extract of the first paragraph of the page in\n  plain text and HTML as well as the type of page. This is useful for page\n  previews (fka. 
Hovercards, aka. Popups) on the web and link previews in the\n  apps.\n\nHTTP ``Accept-Language`` header (``send_accept_language_header``):\n  The desired language variant code for wikis where LanguageConverter_ is\n  enabled.\n\n.. _wikipedia rest_v1 summary API:\n   https://en.wikipedia.org/api/rest_v1/#/Page%20content/get_page_summary__title_\n\n\"\"\"\n\nwiki_lc_locale_variants = {\n    \"zh\": (\n        \"zh-CN\",\n        \"zh-HK\",\n        \"zh-MO\",\n        \"zh-MY\",\n        \"zh-SG\",\n        \"zh-TW\",\n    ),\n    \"zh-classical\": (\"zh-classical\",),\n}\n\"\"\"Mapping rule of the LanguageConverter_ to map a language and its variants to\na Locale (used in the HTTP ``Accept-Language`` header). For example see `LC\nChinese`_.\n\n.. _LC Chinese:\n   https://meta.wikimedia.org/wiki/Wikipedias_in_multiple_writing_systems#Chinese\n\"\"\"\n\nwikipedia_script_variants = {\n    \"zh\": (\n        \"zh_Hant\",\n        \"zh_Hans\",\n    )\n}\n\n\ndef get_wiki_params(sxng_locale, eng_traits):\n    \"\"\"Returns the Wikipedia language tag and the netloc that fits to the\n    ``sxng_locale``.  
To support LanguageConverter_ this function rates a locale\n    (region) higher than a language (compare :py:obj:`wiki_lc_locale_variants`).\n\n    \"\"\"\n    eng_tag = eng_traits.get_region(sxng_locale, eng_traits.get_language(sxng_locale, \"en\"))\n    wiki_netloc = eng_traits.custom[\"wiki_netloc\"].get(eng_tag, \"en.wikipedia.org\")\n    return eng_tag, wiki_netloc\n\n\ndef request(query, params):\n    \"\"\"Assemble a request (`wikipedia rest_v1 summary API`_).\"\"\"\n    if query.islower():\n        query = query.title()\n\n    _eng_tag, wiki_netloc = get_wiki_params(params[\"searxng_locale\"], traits)\n    title = urllib.parse.quote(query)\n    params[\"url\"] = rest_v1_summary_url.format(wiki_netloc=wiki_netloc, title=title)\n\n    params[\"raise_for_httperror\"] = False\n    params[\"soft_max_redirects\"] = 2\n\n    return params\n\n\n# get response from search-request\ndef response(resp):\n\n    results = []\n    if resp.status_code == 404:\n        return []\n    if resp.status_code == 400:\n        try:\n            api_result = resp.json()\n        except Exception:  # pylint: disable=broad-except\n            pass\n        else:\n            if (\n                api_result[\"type\"] == \"https://mediawiki.org/wiki/HyperSwitch/errors/bad_request\"\n                and api_result[\"detail\"] == \"title-invalid-characters\"\n            ):\n                return []\n\n    _network.raise_for_httperror(resp)\n\n    api_result = resp.json()\n    title = utils.html_to_text(api_result.get(\"titles\", {}).get(\"display\") or api_result.get(\"title\"))\n    wikipedia_link = api_result[\"content_urls\"][\"desktop\"][\"page\"]\n\n    if \"list\" in display_type or api_result.get(\"type\") != \"standard\":\n        # show item in the result list if 'list' is in the display options or it\n        # is an item that can't be displayed in an infobox.\n        results.append(\n            {\n                \"url\": wikipedia_link,\n                \"title\": title,\n 
               \"content\": api_result.get(\"description\", \"\"),\n            }\n        )\n\n    if \"infobox\" in display_type:\n        if api_result.get(\"type\") == \"standard\":\n            results.append(\n                {\n                    \"infobox\": title,\n                    \"id\": wikipedia_link,\n                    \"content\": api_result.get(\"extract\", \"\"),\n                    \"img_src\": api_result.get(\"thumbnail\", {}).get(\"source\"),\n                    \"urls\": [{\"title\": \"Wikipedia\", \"url\": wikipedia_link}],\n                }\n            )\n\n    return results\n\n\n# Nonstandard language codes\n#\n# These Wikipedias use language codes that do not conform to the ISO 639\n# standard (which is how wiki subdomains are chosen nowadays).\n\nlang_map = locales.LOCALE_BEST_MATCH.copy()\nlang_map.update(\n    {\n        \"be-tarask\": \"bel\",\n        \"ak\": \"aka\",\n        \"als\": \"gsw\",\n        \"bat-smg\": \"sgs\",\n        \"cbk-zam\": \"cbk\",\n        \"fiu-vro\": \"vro\",\n        \"map-bms\": \"map\",\n        \"no\": \"nb-NO\",\n        \"nrm\": \"nrf\",\n        \"roa-rup\": \"rup\",\n        \"nds-nl\": \"nds\",\n        #'simple: – invented code used for the Simple English Wikipedia (not the official IETF code en-simple)\n        \"zh-min-nan\": \"nan\",\n        \"zh-yue\": \"yue\",\n        \"an\": \"arg\",\n    }\n)\n\n\ndef fetch_traits(engine_traits: EngineTraits):\n    fetch_wikimedia_traits(engine_traits)\n    print(\"WIKIPEDIA_LANGUAGES: %s\" % len(engine_traits.custom[\"WIKIPEDIA_LANGUAGES\"]))\n\n\ndef fetch_wikimedia_traits(engine_traits: EngineTraits):\n    \"\"\"Fetch languages from Wikipedia.  
Not all languages from the\n    :py:obj:`list_of_wikipedias` are supported by SearXNG locales, only those\n    known from :py:obj:`searx.locales.LOCALE_NAMES` or those with a minimal\n    :py:obj:`editing depth <wikipedia_article_depth>`.\n\n    The location of the Wikipedia address of a language is mapped in a\n    :py:obj:`custom field <searx.enginelib.traits.EngineTraits.custom>`\n    (``wiki_netloc``).  Here is a reduced example:\n\n    .. code:: python\n\n       traits.custom['wiki_netloc'] = {\n           \"en\": \"en.wikipedia.org\",\n           ..\n           \"gsw\": \"als.wikipedia.org\",\n           ..\n           \"zh\": \"zh.wikipedia.org\",\n           \"zh-classical\": \"zh-classical.wikipedia.org\"\n       }\n    \"\"\"\n    # pylint: disable=import-outside-toplevel, too-many-branches\n\n    from searx.network import get  # see https://github.com/searxng/searxng/issues/762\n    from searx.utils import searxng_useragent\n\n    engine_traits.custom[\"wiki_netloc\"] = {}\n    engine_traits.custom[\"WIKIPEDIA_LANGUAGES\"] = []\n\n    # insert alias to map from a script or region to a wikipedia variant\n\n    for eng_tag, sxng_tag_list in wikipedia_script_variants.items():\n        for sxng_tag in sxng_tag_list:\n            engine_traits.languages[sxng_tag] = eng_tag\n    for eng_tag, sxng_tag_list in wiki_lc_locale_variants.items():\n        for sxng_tag in sxng_tag_list:\n            engine_traits.regions[sxng_tag] = eng_tag\n\n    headers = {\"Accept\": \"*/*\", \"User-Agent\": searxng_useragent()}\n    resp = get(list_of_wikipedias, timeout=5, headers=headers)\n    if not resp.ok:\n        raise RuntimeError(\"Response from Wikipedia is not OK.\")\n\n    dom = html.fromstring(resp.text)\n    for row in dom.xpath('//table[contains(@class,\"sortable\")]//tbody/tr'):\n        cols = row.xpath(\"./td\")\n        if not cols:\n            continue\n        cols = [c.text_content().strip() for c in cols]\n\n        depth = float(cols[11].replace(\"-\", 
\"0\").replace(\",\", \"\"))\n        articles = int(cols[4].replace(\",\", \"\").replace(\",\", \"\"))\n\n        eng_tag = cols[3]\n        wiki_url = row.xpath(\"./td[4]/a/@href\")[0]\n        wiki_url = urllib.parse.urlparse(wiki_url)\n\n        try:\n            sxng_tag = locales.language_tag(babel.Locale.parse(lang_map.get(eng_tag, eng_tag), sep=\"-\"))\n        except babel.UnknownLocaleError:\n            # print(\"ERROR: %s [%s] is unknown by babel\" % (cols[0], eng_tag))\n            continue\n        finally:\n            engine_traits.custom[\"WIKIPEDIA_LANGUAGES\"].append(eng_tag)\n\n        if sxng_tag not in locales.LOCALE_NAMES:\n            if articles < 10000:\n                # exclude languages with too few articles\n                continue\n\n            if int(depth) < 20:\n                # Rough indicator of a Wikipedia’s quality, showing how\n                # frequently its articles are updated.\n                continue\n\n        conflict = engine_traits.languages.get(sxng_tag)\n        if conflict:\n            if conflict != eng_tag:\n                print(\"CONFLICT: babel %s --> %s, %s\" % (sxng_tag, conflict, eng_tag))\n            continue\n\n        engine_traits.languages[sxng_tag] = eng_tag\n        engine_traits.custom[\"wiki_netloc\"][eng_tag] = wiki_url.netloc\n\n    engine_traits.custom[\"WIKIPEDIA_LANGUAGES\"].sort()\n"
  },
  {
    "path": "docs/more_powerful_search_skill/extra/youtube_noapi.py",
    "content": "# SPDX-License-Identifier: AGPL-3.0-or-later\n\"\"\"Youtube (Videos)\"\"\"\n\nfrom functools import reduce\nfrom json import loads, dumps\nfrom urllib.parse import quote_plus\n\nfrom searx.utils import extr\n\n# about\nabout = {\n    \"website\": 'https://www.youtube.com/',\n    \"wikidata_id\": 'Q866',\n    \"official_api_documentation\": 'https://developers.google.com/youtube/v3/docs/search/list?apix=true',\n    \"use_official_api\": False,\n    \"require_api_key\": False,\n    \"results\": 'HTML',\n}\n\n# engine dependent config\ncategories = ['videos', 'music']\npaging = True\nlanguage_support = False\ntime_range_support = True\n\n# search-url\nbase_url = 'https://www.youtube.com/results'\nsearch_url = base_url + '?search_query={query}&page={page}'\ntime_range_url = '&sp=EgII{time_range}%253D%253D'\n# the key seems to be constant\nnext_page_url = f'https://www.youtube.com/youtubei/v1/search?key={key}'\ntime_range_dict = {'day': 'Ag', 'week': 'Aw', 'month': 'BA', 'year': 'BQ'}\n\nbase_youtube_url = 'https://www.youtube.com/watch?v='\n\n\n# do search-request\ndef request(query, params):\n    params['cookies']['CONSENT'] = \"YES+\"\n    if not params['engine_data'].get('next_page_token'):\n        params['url'] = search_url.format(query=quote_plus(query), page=params['pageno'])\n        if params['time_range'] in time_range_dict:\n            params['url'] += time_range_url.format(time_range=time_range_dict[params['time_range']])\n    else:\n        params['url'] = next_page_url\n        params['method'] = 'POST'\n        params['data'] = dumps(\n            {\n                'context': {\"client\": {\"clientName\": \"WEB\", \"clientVersion\": \"2.20210310.12.01\"}},\n                'continuation': params['engine_data']['next_page_token'],\n            }\n        )\n        params['headers']['Content-Type'] = 'application/json'\n\n    return params\n\n\n# get response from search-request\ndef response(resp):\n    if 
resp.search_params.get('engine_data'):\n        return parse_next_page_response(resp.text)\n    return parse_first_page_response(resp.text)\n\n\ndef parse_next_page_response(response_text):\n    results = []\n    result_json = loads(response_text)\n    for section in (\n        result_json['onResponseReceivedCommands'][0]\n        .get('appendContinuationItemsAction')['continuationItems'][0]\n        .get('itemSectionRenderer')['contents']\n    ):\n        if 'videoRenderer' not in section:\n            continue\n        section = section['videoRenderer']\n        content = \"-\"\n        if 'descriptionSnippet' in section:\n            content = ' '.join(x['text'] for x in section['descriptionSnippet']['runs'])\n        results.append(\n            {\n                'url': base_youtube_url + section['videoId'],\n                'title': ' '.join(x['text'] for x in section['title']['runs']),\n                'content': content,\n                'author': section['ownerText']['runs'][0]['text'],\n                'length': section['lengthText']['simpleText'],\n                'template': 'videos.html',\n                'iframe_src': 'https://www.youtube-nocookie.com/embed/' + section['videoId'],\n                'thumbnail': section['thumbnail']['thumbnails'][-1]['url'],\n            }\n        )\n    try:\n        token = (\n            result_json['onResponseReceivedCommands'][0]\n            .get('appendContinuationItemsAction')['continuationItems'][1]\n            .get('continuationItemRenderer')['continuationEndpoint']\n            .get('continuationCommand')['token']\n        )\n        results.append(\n            {\n                \"engine_data\": token,\n                \"key\": \"next_page_token\",\n            }\n        )\n    except:  # pylint: disable=bare-except\n        pass\n\n    return results\n\n\ndef parse_first_page_response(response_text):\n    results = []\n    results_data = extr(response_text, 'ytInitialData = ', ';</script>')\n\n    
results_json = loads(results_data) if results_data else {}\n    sections = (\n        results_json.get('contents', {})\n        .get('twoColumnSearchResultsRenderer', {})\n        .get('primaryContents', {})\n        .get('sectionListRenderer', {})\n        .get('contents', [])\n    )\n\n    for section in sections:\n        if \"continuationItemRenderer\" in section:\n            next_page_token = (\n                section[\"continuationItemRenderer\"]\n                .get(\"continuationEndpoint\", {})\n                .get(\"continuationCommand\", {})\n                .get(\"token\", \"\")\n            )\n            if next_page_token:\n                results.append(\n                    {\n                        \"engine_data\": next_page_token,\n                        \"key\": \"next_page_token\",\n                    }\n                )\n        for video_container in section.get('itemSectionRenderer', {}).get('contents', []):\n            video = video_container.get('videoRenderer', {})\n            videoid = video.get('videoId')\n            if videoid is not None:\n                url = base_youtube_url + videoid\n                thumbnail = 'https://i.ytimg.com/vi/' + videoid + '/hqdefault.jpg'\n                title = get_text_from_json(video.get('title', {}))\n                content = get_text_from_json(video.get('descriptionSnippet', {}))\n                author = get_text_from_json(video.get('ownerText', {}))\n                length = get_text_from_json(video.get('lengthText', {}))\n\n                # append result\n                results.append(\n                    {\n                        'url': url,\n                        'title': title,\n                        'content': content,\n                        'author': author,\n                        'length': length,\n                        'template': 'videos.html',\n                        'iframe_src': 'https://www.youtube-nocookie.com/embed/' + videoid,\n                        
'thumbnail': thumbnail,\n                    }\n                )\n\n    # return results\n    return results\n\n\ndef get_text_from_json(element):\n    if 'runs' in element:\n        return reduce(lambda a, b: a + b.get('text', ''), element.get('runs'), '')\n    return element.get('simpleText', '')\n"
  },
  {
    "path": "docs/more_powerful_search_skill/rss_parsor.py",
    "content": "import httpx\nimport feedparser\nfrom reference.async_logger import wis_logger\nfrom reference.wis import CrawlResult, SqliteCache\nfrom typing import List, Tuple\nfrom reference.wis.ws_connect import notify_user\nfrom reference.tools.general_utils import normalize_publish_date\nimport asyncio\n\n\nasync def fetch_rss(url, existings: set=set(), cache_manager: SqliteCache = None) -> Tuple[List[CrawlResult], str, dict]:\n    entries = None\n    if cache_manager:\n        entries = await cache_manager.get(url, namespace='rss')\n        if entries == '**empty**':\n            return [], '', {}\n        \n    if not entries:\n        max_retries = 3\n        base_delay = 10  # seconds\n        for attempt in range(max_retries):\n            try:\n                async with httpx.AsyncClient(timeout=30) as client:\n                    response = await client.get(url)\n                    response.raise_for_status()\n                content = response.content  # bytes\n                break\n            except Exception as e:\n                if attempt < max_retries - 1:\n                    delay = base_delay * (2 ** attempt)\n                    wis_logger.debug(f\"fetching RSS from {url} attempt {attempt + 1} failed with error: {str(e)}, retrying in {delay} seconds\")\n                    await asyncio.sleep(delay)\n                else:\n                    wis_logger.warning(f\"fetching RSS from {url} failed after {max_retries} attempts with error: {str(e)}\")\n                    await notify_user(15, [url])\n                    return [], '', {}\n        \n        parsed = feedparser.parse(content)\n        if parsed.get(\"bozo\", False):\n            wis_logger.warning(f\"Error parsing RSS from {url}: {parsed.get('bozo_exception', '')}\")\n            raise RuntimeError(f\"RSS from {url}: {parsed.get('bozo_exception', '')}\")\n        \n        entries = parsed.entries\n        if cache_manager:\n            await cache_manager.set(url, entries, 
60*24, namespace='rss')\n    \n    results = []\n    markdown = ''\n    link_dict = {}\n    for entry in entries:\n        html_parts = []\n        description = ''\n        article_url = entry.get('link', url)\n        if article_url in existings:\n            continue\n        # 1. If the entry has a content field, iterate over each content_item\n        if 'content' in entry and entry['content']:\n            for content_item in entry['content']:\n                t = content_item.get('type', '').lower()\n                if t.startswith('text/') or t == 'application/xhtml+xml':\n                    # Try several possible field names\n                    for key in ['value', 'body', 'content']:\n                        if key in content_item:\n                            html_parts.append(content_item[key])\n                            break\n        # 2. If there is no content field, fall back to summary or description\n        if not html_parts:\n            summary = entry.get('summary', '')\n            description = entry.get('description', '')\n            if len(summary) > len(description):\n                description = summary\n            if len(description) > 50:\n                html_parts.append(description)\n                description = ''\n\n        if not html_parts and not description:\n            wis_logger.debug(f\"No content or summary or description found for {article_url} from rss: {url}\")\n            continue\n        # 4. 
Join all content parts into a single HTML string\n        author = entry.get('author', '')\n        title = entry.get('title', '')\n        publish_date = normalize_publish_date(entry.get('published', '')) or entry.get('published', '')\n        if html_parts:\n            html = '\\n\\n'.join(html_parts)\n            results.append(CrawlResult(\n                url=article_url,\n                html=html,\n                title=title,\n                author=author,\n                publish_date=publish_date,\n            ))\n            # existings.add(article_url)  is done later, once LLM extraction has finished\n        elif description and article_url != url:\n            key = f\"[{len(link_dict)+1}]\"\n            link_dict[key] = article_url\n            markdown += f\"* {key}{description} (Author: {author} Publish Date: {publish_date}) {key}\\n\"\n            # existings.add(article_url)  is done later, once LLM extraction has finished\n    return results, markdown, link_dict\n"
  },
  {
    "path": "docs/prompt_videos.md",
    "content": "\n\nhttps://github.com/user-attachments/assets/8d097b3b-f9ab-42eb-98bb-88af5d28b089\n\n"
  },
  {
    "path": "openclaw.version",
    "content": "# OpenClaw upstream version pin\n# All addon developers and CI read this file, so that development and testing target the same version\n# The format follows the openclaw-for-business/openclaw.version convention\n#\n# Usage (shell):\n#   source openclaw.version\n#   git clone https://github.com/openclaw/openclaw openclaw\n#   git -C openclaw checkout $OPENCLAW_COMMIT\n#\nOPENCLAW_VERSION=2026.3.13\nOPENCLAW_COMMIT=61d171ab0b2fe4abc9afe89c518586274b4b76c2\n"
  },
  {
    "path": "scripts/generate-patch.sh",
    "content": "#!/bin/bash\nset -e\n\ncd \"$(dirname \"$0\")/..\"\n\nif [ -z \"$1\" ]; then\n  echo \"Usage: ./scripts/generate-patch.sh <patch-name>\"\n  exit 1\nfi\n\nPATCH_NAME=\"$1\"\nPATCHES_DIR=\"patches\"\nmkdir -p \"$PATCHES_DIR\"\n\ncd openclaw\n\n# Determine the next patch number\nLAST_NUM=$(ls ../$PATCHES_DIR/*.patch 2>/dev/null | sed 's/.*\\/\\([0-9]*\\)-.*/\\1/' | sort -n | tail -1)\nNEXT_NUM=$(printf \"%03d\" $((10#${LAST_NUM:-0} + 1)))\n\nPATCH_FILE=\"../$PATCHES_DIR/${NEXT_NUM}-${PATCH_NAME}.patch\"\n\necho \"📝 Generating patch: $(basename \"$PATCH_FILE\")\"\n\ngit diff -- . ':(exclude)pnpm-lock.yaml' > \"$PATCH_FILE\"\n\nif [ ! -s \"$PATCH_FILE\" ]; then\n  echo \"⚠️  No changes detected\"\n  rm \"$PATCH_FILE\"\n  exit 1\nfi\n\necho \"✅ Patch generated: $PATCH_FILE\"\n"
  },
  {
    "path": "tests/README.md",
    "content": "All tests below require:\n\n- Create a Markdown file under test that records the result of each test (success/fail), and compile the results into a list at the end;\n- Save the snapshot captured in each test as a separate local file for later reference.\n\n\n# Corporate and Government Websites\n\nOpen the pages below, **6 pages concurrently per batch**, and check that each page opens normally, does not trigger anti-bot detection, and yields a valid snapshot.\n\n## Test Cases\n\n```\nhttps://www.komatsu.com/en-us\nhttps://www.putzmeister.com/web/european-union\nhttps://www.liebherr.com/en-hk/group/start-page-5221008\nhttps://www.cat.com/global-selector.html\nhttps://www.bing.com/search?q=%E5%8D%8E%E5%B0%94%E8%A1%97%E6%97%A5%E6%8A%A5\nhttps://www.wsj.com/\nhttps://www.bloomberg.com\nhttps://www.justice.gov/\nhttp://www.china-cer.com.cn/policy_base/\nhttps://zjw.sh.gov.cn/zwgk/index.html#tab2-a\nhttps://fgj.sh.gov.cn/gfxwj/index.html\nhttps://rsj.sh.gov.cn/tgwgfx_17726/index.html\nhttps://ybj.sh.gov.cn/dybz3/index.html\nhttps://www.mohurd.gov.cn/gongkai/fdzdgknr/zgzygwywj/index.html\nhttps://www.shanghai.gov.cn/nw39221/index.html\nhttps://www.shanghai.gov.cn/nw39220/index.html\nhttps://www.shanghai.gov.cn/nw11408/index.html\nhttps://www.shanghai.gov.cn/nw11407/index.html\nhttps://www.shanghai.gov.cn/nw2407/index.html\nhttps://www.shanghai.gov.cn/nw42850/index.html\nhttps://www.shanghai.gov.cn/nw42944/index.html\nhttps://www.gov.cn/zhengce/index.htm\n```\n\n# Search Engine Stress Test\n\nOpen search pages and stress-test them, **6 pages concurrently per batch**, and check that each one returns normally and yields a snapshot.\n\n# Simulated Pre-Login Test\n\nOpen https://www.wsj.com/opinion/donald-trump-tariffs-ieepa-supreme-court-john-roberts-opinion-e2610d81\n\nThe test should detect the login elements on the page and prompt the user to log in. Revisiting the page afterwards should succeed and return the content (the body length should change and exceed that of the first fetch).\n\n# Social Media Retrieval\n\n## 1. 
Fetching Specified Profile Pages Without Login\n\n1.1 Open the first site below and check whether anti-bot detection is triggered and whether the page content can be retrieved. (All of this content is available without logging in.)\n\n1.2 On the opened page, pick any post and open it, then check whether its details, including comments, can be retrieved. If a login check is triggered, prompt the user to log in; revisiting the page afterwards should succeed and return the content (the body length should change and exceed that of the first fetch).\n\n1.3 Repeat the steps above for each site in turn.\n\n```\nhttps://x.com/valormental\nhttps://www.facebook.com/andrea.sow.31\nhttps://www.linkedin.com/in/baoqiangliu/\nhttps://www.instagram.com/elisameliani/\nhttps://space.bilibili.com/3546603057056627\nhttps://mp.weixin.qq.com/s/Duij3Z2vrImLuOzanqbgbA\nhttps://www.douyin.com/user/MS4wLjABAAAAXUpP_zAelVixv3zv_sWINae86Dt0FMPRZyuozH8MmhbBjvgoDg_xq3Lqnwlacelc\nhttps://www.kuaishou.com/profile/3xvwve5yerjsvvg\nhttps://m.weibo.cn/profile/2194035935\nhttps://www.xiaohongshu.com/user/profile/5f035b1c0000000001002389?xsec_token=ABti9cMRn3S9ARpTWxqiy-5oHI9_QXq50-5qjiSm8emMk=&xsec_source=pc_feed\nhttps://www.zhihu.com/people/lingzezhao\nhttps://discord.com/servers/midjourney-662267976984297473\n```\n\n## 2. Search\n\n2.1 Open the sites below one by one and check whether anti-bot detection is triggered and whether the page content can be retrieved. If a login requirement is detected, prompt the user to log in; revisiting the page afterwards should succeed and return the content (the body length should change and exceed that of the first fetch);\n\n2.2 On the opened page, pick any post and open it, then check whether its details, including comments, can be retrieved. If a login check is triggered, prompt the user to log in; revisiting the page afterwards should succeed and return the content (the body length should change and exceed that of the first fetch).\n\n2.3 Repeat the steps above for each site in turn.\n\n```\nhttps://x.com/search?q=OpenClaw\nhttps://www.facebook.com/search/posts/?q=OpenClaw\nhttps://www.linkedin.com/search/results/all/?keywords=openclaw\nhttps://www.instagram.com/explore/tags/OpenClaw/\nhttps://search.bilibili.com/all?keyword=openclaw\nhttps://www.douyin.com/search/OpenClaw?type=user\nhttps://www.douyin.com/search/OpenClaw?type=video\nhttps://www.kuaishou.com/search/video?searchKey=openclaw\nhttps://m.weibo.cn/search?containerid=100103type%3D1%26q%3Dopenclaw\nhttps://www.xiaohongshu.com/search_result?keyword=openclaw\nhttps://www.zhihu.com/search?q=openclaw&type=content\nhttps://www.zhihu.com/search?q=openclaw&type=people\nhttps://www.zhihu.com/search?q=openclaw&type=zvideo\n```\n\n## Output\n\nEach run creates the directory `browser_test/results/<runId>/`:\n\n- `report.md`: summary report (success/fail/blocked)\n- `cases/<caseId>.json`: per-case details (duration, errors, login check result)\n- 
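`snapshots/<caseId>.txt`: AI snapshot before login\n- `snapshots/<caseId>_after.txt`: AI snapshot after login (only when the login flow is triggered)\n\n---\n\n# Managed Browser Automated Tests (Recommended)\n\nScript path: `browser_test/run-managed-tests.mjs`\n\nThe tests use the **openclaw-managed browser** and are driven by the following four CLI commands; no Chrome extension is required:\n\n```\nopenclaw browser --browser-profile openclaw status\nopenclaw browser --browser-profile openclaw start\nopenclaw browser --browser-profile openclaw open <url>\nopenclaw browser --browser-profile openclaw snapshot\n```\n\n## Running the Tests\n\n```bash\n# Quick check (8 representative cases, ~5 minutes)\nnode browser_test/run-managed-tests.mjs --mode smoke\n\n# Full test (corporate/government/search/news, with interactive pre-login, ~25 minutes)\nnode browser_test/run-managed-tests.mjs --mode full\n\n# Complete test (adds social media, with multiple interactive login prompts)\nnode browser_test/run-managed-tests.mjs --mode social\n```\n\n## Optional Arguments\n\n| Argument | Description | Default |\n|------|------|-------|\n| `--mode <smoke\\|full\\|social>` | Test scope | `smoke` |\n| `--profile <name>` | Browser profile | `openclaw` |\n| `--stabilizeMs <ms>` | Time to wait for the page to stabilize after opening | `4000` |\n| `--timeoutMs <ms>` | Timeout per command | `60000` |\n| `--outputDir <dir>` | Results root directory | `browser_test/results` |\n\n## Before Running\n\nStart the openclaw gateway first and keep it running:\n\n```bash\n./scripts/dev.sh gateway\n```\n\nNo Chrome extension is required; the managed browser is managed automatically by openclaw.\n\n## Output Structure\n\n```\nbrowser_test/results/<RUN_ID>_managed/\n  report.md              Summary report (success/blocked/partial/error)\n  cases/<id>.json        Per-case details (duration, snapshot size in bytes, content analysis result)\n  snapshots/<id>.txt     Page AI snapshot (snapshot --format ai --mode efficient)\n  snapshots/<id>_after.txt AI snapshot after login (only when the pre-login test is triggered)\n```\n\nAll verdicts (whether anti-bot protection was triggered, whether a login wall is present, whether the content is sufficient) are based on the extracted AI snapshot text.\n\n## Concurrency Strategy\n\n- Corporate sites, government sites, and search engines: 6 pages concurrently per batch\n- News media, pre-login, and social media: tested serially, one at a time (to avoid login/state interference)\n\n## Login Test Behavior\n\nIn `full` / `social` mode, when a pre-login test case (such as the WSJ article) is reached:\n\n1. Fetch the content once (before);\n2. If a login wall is detected, the command line pauses and waits for a manual login;\n3. After Enter is pressed, fetch the content again (after);\n4. If the content grew by > 20% and by > 200 chars, the login is judged successful.\n"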
  },
  {
    "path": "tests/run-managed-tests.mjs",
    "content": "#!/usr/bin/env node\n/**\n * OpenClaw Managed Browser — Automated Test Suite\n *\n * Opens each test URL with the openclaw-managed browser, extracts AI snapshot\n * content via native `snapshot --format ai --mode detailed`, saves it to disk,\n * and analyses the snapshot text for\n * anti-bot / login-wall signals.\n *\n * Commands used:\n *   openclaw browser --browser-profile openclaw status\n *   openclaw browser --browser-profile openclaw start\n *   openclaw browser --browser-profile openclaw open <url>\n *   openclaw browser --browser-profile openclaw snapshot --format ai --mode detailed\n *\n * Usage:\n *   node browser_test/run-managed-tests.mjs [options]\n *\n * Options:\n *   --mode <smoke|full|social>   scope (default: smoke)\n *   --profile <name>             browser profile (default: openclaw)\n *   --stabilizeMs <ms>           wait after open before extracting HTML (default: 4000)\n *   --timeoutMs <ms>             CLI timeout per command (default: 60000)\n *   --outputDir <dir>            results root (default: browser_test/results)\n *\n * Modes:\n *   smoke  — 8 representative cases, no login prompts  (~5 min)\n *   full   — all README section-1/2/3 cases + pre-login interactive  (~25 min)\n *   social — full + social media  (multiple interactive login prompts)\n */\n\nimport { spawnSync }                from 'node:child_process';\nimport { mkdirSync, writeFileSync } from 'node:fs';\nimport { join, dirname, resolve }   from 'node:path';\nimport { fileURLToPath }            from 'node:url';\nimport { createInterface }          from 'node:readline';\n\nconst __dirname    = dirname(fileURLToPath(import.meta.url));\nconst PROJECT_ROOT = resolve(__dirname, '..');\nconst OPENCLAW_DIR = join(PROJECT_ROOT, 'openclaw');\n\n// ── Argument parsing ──────────────────────────────────────────────────────────\nconst argv    = process.argv.slice(2);\nconst getArg  = (k, d) => { const i = argv.indexOf(`--${k}`); return i >= 0 && argv[i+1] !== 
undefined ? argv[i+1] : d; };\n\nconst MODE         = getArg('mode',        'smoke');\nconst PROFILE      = getArg('profile',     'openclaw');\nconst STABILIZE_MS = parseInt(getArg('stabilizeMs', '4000'),  10);\nconst TIMEOUT_MS   = parseInt(getArg('timeoutMs',   '60000'), 10);\nconst OUTPUT_DIR   = getArg('outputDir',   join(PROJECT_ROOT, 'browser_test', 'results'));\n\n// ── Run directory ─────────────────────────────────────────────────────────────\nconst RUN_ID  = new Date().toISOString().replace(/[:.]/g, '-').slice(0, 19) + '_managed';\nconst RUN_DIR = join(OUTPUT_DIR, RUN_ID);\nconst CASES   = join(RUN_DIR, 'cases');\nconst SNAPSHOTS = join(RUN_DIR, 'snapshots');\n\nmkdirSync(CASES, { recursive: true });\nmkdirSync(SNAPSHOTS,  { recursive: true });\n\n// ── Environment for openclaw processes (uses default ~/.openclaw) ────────────\nconst OC_ENV = { ...process.env };\n\n// ── openclaw CLI wrapper ──────────────────────────────────────────────────────\nfunction oc(args, { ms = TIMEOUT_MS, safe = false } = {}) {\n  const res = spawnSync('pnpm', ['openclaw', ...args], {\n    cwd:       OPENCLAW_DIR,\n    env:       OC_ENV,\n    timeout:   ms,\n    encoding:  'utf8',\n    maxBuffer: 30 * 1024 * 1024,\n  });\n  const out = (res.stdout ?? '').trim();\n  const err = (res.stderr ?? '').trim();\n  if (!safe && res.status !== 0) {\n    throw new Error(err || out || `openclaw exited ${res.status}`);\n  }\n  return { ok: res.status === 0, out, err };\n}\n\n/** Extract first valid JSON object from CLI output (skips logo / header lines). 
*/\nfunction extractJson(text) {\n  if (!text) return null;\n  const raw = text.trim();\n\n  try { return JSON.parse(raw); } catch { /* continue */ }\n\n  // Try parsing from first JSON opener to each possible closing index.\n  const firstObj = raw.indexOf('{');\n  const firstArr = raw.indexOf('[');\n  const starts = [firstObj, firstArr].filter(i => i >= 0).sort((a, b) => a - b);\n  for (const start of starts) {\n    for (let end = raw.length; end > start; end--) {\n      const ch = raw[end - 1];\n      if (ch !== '}' && ch !== ']') continue;\n      const candidate = raw.slice(start, end).trim();\n      try { return JSON.parse(candidate); } catch { /* try shorter tail */ }\n    }\n  }\n\n  // Last fallback: parse any single JSON line (for compact outputs).\n  for (const line of raw.split('\\n').reverse()) {\n    const t = line.trim();\n    if (!t) continue;\n    if (t.startsWith('{') || t.startsWith('[') || t.startsWith('\"')) {\n      try { return JSON.parse(t); } catch { /* try next line */ }\n    }\n  }\n\n  return null;\n}\n\nconst sleep = (ms) => new Promise(r => setTimeout(r, ms));\n\nfunction waitForEnter(msg) {\n  return new Promise(resolve => {\n    const rl = createInterface({ input: process.stdin, output: process.stdout });\n    rl.question(msg, () => { rl.close(); resolve(); });\n  });\n}\n\n// ── Snapshot-based content analysis ───────────────────────────────────────────\n// Pick a best-effort title from an AI snapshot line like:\n// - heading \"Some Title\" [level=1] [ref=e1]\nfunction extractTitle(snapshot) {\n  const lines = String(snapshot ?? '').split('\\n');\n  for (const line of lines) {\n    if (!line.includes('heading')) continue;\n    const m = line.match(/\"([^\"]{1,200})\"/);\n    if (m && m[1]) return m[1].trim();\n  }\n  return '';\n}\n\n// Convert snapshot tree text to plain text for keyword matching.\nfunction snapshotToText(snapshot) {\n  return String(snapshot ?? 
'')\n    .replace(/\\[[^\\]]+\\]/g, ' ')\n    .replace(/-\\s+/g, ' ')\n    .replace(/\\s+/g, ' ')\n    .trim();\n}\n\nconst BOT_PATTERNS = [\n  /captcha/i,\n  /are you (a )?human/i,\n  /verify you(\\'re| are)/i,\n  /robot check/i,\n  /security check/i,\n  /ddos.{0,20}protection/i,\n  /cloudflare ray id/i,\n  /just a moment/i,\n  /checking your browser/i,\n  /access denied/i,\n  /403 forbidden/i,\n  /request blocked/i,\n  /you('ve| have) been blocked/i,\n  /请完成安全验证/,\n  /人机验证/,\n  /请证明您不是机器人/,\n];\n\nconst LOGIN_PATTERNS = [\n  /sign in to continue/i,\n  /log in to continue/i,\n  /subscribe to (read|continue|access)/i,\n  /create (an? )?account/i,\n  /登录后(才能|可以|方可)/,\n  /请先登录/,\n  /立即登录/,\n  /(注册|登录)享受更多/,\n];\n\nfunction analyzeSnapshot(snapshot) {\n  const title = extractTitle(snapshot);\n  const text  = snapshotToText(snapshot);\n  const lower = text.toLowerCase();\n\n  const botBlocked = BOT_PATTERNS.some(re => re.test(lower));\n  // Login wall: softer signal — only flag if bot not triggered\n  const loginWall  = !botBlocked && LOGIN_PATTERNS.some(re => re.test(text));\n\n  return {\n    title,\n    snapshotChars: String(snapshot ?? 
'').length,\n    textChars:  text.length,\n    rich:       text.length >= 300,  // meaningful snapshot content\n    botBlocked,\n    loginWall,\n  };\n}\n\n// ── Test case definitions ─────────────────────────────────────────────────────\n// Fields:\n//   id         — unique id used in filenames\n//   sec        — display section\n//   url        — target URL\n//   login      — true: login wall is expected/normal (not a failure)\n//   loginTest  — pause for manual login then re-extract HTML (pre-login scenario)\n//   socialOnly — only run in 'social' mode\nconst ALL_CASES = [\n\n  // ── 企业官网 ──────────────────────────────────────────────────────────────\n  { id: '1.01_komatsu',     sec: '企业官网', url: 'https://www.komatsu.com/en-us' },\n  { id: '1.02_putzmeister', sec: '企业官网', url: 'https://www.putzmeister.com/web/european-union' },\n  { id: '1.03_liebherr',    sec: '企业官网', url: 'https://www.liebherr.com/en-hk/group/start-page-5221008' },\n  { id: '1.04_cat',         sec: '企业官网', url: 'https://www.cat.com/global-selector.html' },\n\n  // ── 新闻媒体 ──────────────────────────────────────────────────────────────\n  { id: '2.01_wsj',         sec: '新闻媒体', url: 'https://www.wsj.com/',          login: true },\n  { id: '2.02_bloomberg',   sec: '新闻媒体', url: 'https://www.bloomberg.com' },\n\n  // ── 政府网站（美国）──────────────────────────────────────────────────────\n  { id: '3.01_justice',     sec: '政府网站', url: 'https://www.justice.gov/' },\n\n  // ── 政府网站（中国）──────────────────────────────────────────────────────\n  { id: '3.02_china_cer',   sec: '政府网站', url: 'http://www.china-cer.com.cn/policy_base/' },\n  { id: '3.03_zjw_sh',      sec: '政府网站', url: 'https://zjw.sh.gov.cn/zwgk/index.html#tab2-a' },\n  { id: '3.04_fgj_sh',      sec: '政府网站', url: 'https://fgj.sh.gov.cn/gfxwj/index.html' },\n  { id: '3.05_rsj_sh',      sec: '政府网站', url: 'https://rsj.sh.gov.cn/tgwgfx_17726/index.html' },\n  { id: '3.06_ybj_sh',      sec: '政府网站', url: 'https://ybj.sh.gov.cn/dybz3/index.html' },\n  { id: 
'3.07_mohurd',      sec: '政府网站', url: 'https://www.mohurd.gov.cn/gongkai/fdzdgknr/zgzygwywj/index.html' },\n  { id: '3.08_sh_nw39221',  sec: '政府网站', url: 'https://www.shanghai.gov.cn/nw39221/index.html' },\n  { id: '3.09_sh_nw39220',  sec: '政府网站', url: 'https://www.shanghai.gov.cn/nw39220/index.html' },\n  { id: '3.10_sh_nw11408',  sec: '政府网站', url: 'https://www.shanghai.gov.cn/nw11408/index.html' },\n  { id: '3.11_sh_nw11407',  sec: '政府网站', url: 'https://www.shanghai.gov.cn/nw11407/index.html' },\n  { id: '3.12_sh_nw2407',   sec: '政府网站', url: 'https://www.shanghai.gov.cn/nw2407/index.html' },\n  { id: '3.13_sh_nw42850',  sec: '政府网站', url: 'https://www.shanghai.gov.cn/nw42850/index.html' },\n  { id: '3.14_sh_nw42944',  sec: '政府网站', url: 'https://www.shanghai.gov.cn/nw42944/index.html' },\n  { id: '3.15_gov_cn',      sec: '政府网站', url: 'https://www.gov.cn/zhengce/index.htm' },\n\n  // ── 搜索引擎压力测试 ──────────────────────────────────────────────────────\n  { id: '4.01_bing_wsj',    sec: '搜索引擎', url: 'https://www.bing.com/search?q=%E5%8D%8E%E5%B0%94%E8%A1%97%E6%97%A5%E6%8A%A5' },\n  { id: '4.02_bing_cat',    sec: '搜索引擎', url: 'https://www.bing.com/search?q=caterpillar+heavy+equipment' },\n  { id: '4.03_bing_policy', sec: '搜索引擎', url: 'https://www.bing.com/search?q=%E4%B8%8A%E6%B5%B7+%E5%BB%BA%E8%AE%BE%E5%B7%A5%E7%A8%8B+%E6%94%BF%E7%AD%96' },\n\n  // ── 预登录功能测试（full+ 模式，交互）───────────────────────────────────\n  { id: '5.01_wsj_article', sec: '预登录测试', login: true, loginTest: true,\n    url: 'https://www.wsj.com/opinion/donald-trump-tariffs-ieepa-supreme-court-john-roberts-opinion-e2610d81' },\n\n  // ── 社交媒体 — 主页（social 模式）────────────────────────────────────────\n  { id: '6.01_x',           sec: '社交媒体', login: true,  socialOnly: true, url: 'https://x.com/valormental' },\n  { id: '6.02_facebook',    sec: '社交媒体', login: true,  socialOnly: true, url: 'https://www.facebook.com/andrea.sow.31' },\n  { id: '6.03_linkedin',    sec: '社交媒体', login: true,  socialOnly: true, url: 
'https://www.linkedin.com/in/baoqiangliu/' },\n  { id: '6.04_instagram',   sec: '社交媒体', login: true,  socialOnly: true, url: 'https://www.instagram.com/elisameliani/' },\n  { id: '6.05_bilibili',    sec: '社交媒体', socialOnly: true, url: 'https://space.bilibili.com/3546603057056627' },\n  { id: '6.06_weixin',      sec: '社交媒体', socialOnly: true, url: 'https://mp.weixin.qq.com/s/Duij3Z2vrImLuOzanqbgbA' },\n  { id: '6.07_douyin',      sec: '社交媒体', login: true,  socialOnly: true, url: 'https://www.douyin.com/user/MS4wLjABAAAAXUpP_zAelVixv3zv_sWINae86Dt0FMPRZyuozH8MmhbBjvgoDg_xq3Lqnwlacelc' },\n  { id: '6.08_kuaishou',    sec: '社交媒体', socialOnly: true, url: 'https://www.kuaishou.com/profile/3xvwve5yerjsvvg' },\n  { id: '6.09_weibo',       sec: '社交媒体', socialOnly: true, url: 'https://m.weibo.cn/profile/2194035935' },\n  { id: '6.10_xiaohongshu', sec: '社交媒体', socialOnly: true, url: 'https://www.xiaohongshu.com/user/profile/5f035b1c0000000001002389?xsec_token=ABti9cMRn3S9ARpTWxqiy-5oHI9_QXq50-5qjiSm8emMk=&xsec_source=pc_feed' },\n  { id: '6.11_zhihu',       sec: '社交媒体', socialOnly: true, url: 'https://www.zhihu.com/people/lingzezhao' },\n  { id: '6.12_discord',     sec: '社交媒体', login: true,  socialOnly: true, url: 'https://discord.com/servers/midjourney-662267976984297473' },\n\n  // ── 社交媒体 — 搜索（social 模式）────────────────────────────────────────\n  { id: '7.01_x_search',        sec: '社交媒体搜索', login: true, socialOnly: true, url: 'https://x.com/search?q=OpenClaw' },\n  { id: '7.02_fb_search',       sec: '社交媒体搜索', login: true, socialOnly: true, url: 'https://www.facebook.com/search/posts/?q=OpenClaw' },\n  { id: '7.03_linkedin_search', sec: '社交媒体搜索', login: true, socialOnly: true, url: 'https://www.linkedin.com/search/results/all/?keywords=openclaw' },\n  { id: '7.04_instagram_tag',   sec: '社交媒体搜索', login: true, socialOnly: true, url: 'https://www.instagram.com/explore/tags/OpenClaw/' },\n  { id: '7.05_bilibili_search', sec: '社交媒体搜索', socialOnly: true, url: 
'https://search.bilibili.com/all?keyword=openclaw' },\n  { id: '7.06_douyin_user',     sec: '社交媒体搜索', socialOnly: true, url: 'https://www.douyin.com/search/OpenClaw?type=user' },\n  { id: '7.07_douyin_video',    sec: '社交媒体搜索', socialOnly: true, url: 'https://www.douyin.com/search/OpenClaw?type=video' },\n  { id: '7.08_kuaishou_search', sec: '社交媒体搜索', socialOnly: true, url: 'https://www.kuaishou.com/search/video?searchKey=openclaw' },\n  { id: '7.09_weibo_search',    sec: '社交媒体搜索', socialOnly: true, url: 'https://m.weibo.cn/search?containerid=100103type%3D1%26q%3Dopenclaw' },\n  { id: '7.10_xhs_search',      sec: '社交媒体搜索', socialOnly: true, url: 'https://www.xiaohongshu.com/search_result?keyword=openclaw' },\n  { id: '7.11_zhihu_content',   sec: '社交媒体搜索', socialOnly: true, url: 'https://www.zhihu.com/search?q=openclaw&type=content' },\n  { id: '7.12_zhihu_people',    sec: '社交媒体搜索', socialOnly: true, url: 'https://www.zhihu.com/search?q=openclaw&type=people' },\n  { id: '7.13_zhihu_video',     sec: '社交媒体搜索', socialOnly: true, url: 'https://www.zhihu.com/search?q=openclaw&type=zvideo' },\n];\n\n// Smoke: one or two cases from each key category\nconst SMOKE_IDS = new Set([\n  '1.01_komatsu', '1.03_liebherr',\n  '2.02_bloomberg',\n  '3.01_justice', '3.02_china_cer',\n  '4.01_bing_wsj', '4.02_bing_cat',\n  '3.08_sh_nw39221',\n]);\n\nfunction buildRunList() {\n  switch (MODE) {\n    case 'smoke':  return ALL_CASES.filter(c => SMOKE_IDS.has(c.id));\n    case 'full':   return ALL_CASES.filter(c => !c.socialOnly);\n    case 'social': return ALL_CASES;\n    default:\n      console.error(`Unknown --mode \"${MODE}\". 
Use: smoke | full | social`);\n      process.exit(1);\n  }\n}\n\nconst RUN_LIST = buildRunList();\nconst PARALLEL_SECTIONS = new Set(['企业官网', '政府网站', '搜索引擎']);\nconst PARALLEL_LIMIT = 6;\n\n// ── Snapshot extraction helper ────────────────────────────────────────────────\nfunction extractSnapshotFromOutput(out) {\n  if (!out) return null;\n  const raw = out.trim();\n  if (!raw) return null;\n\n  // Some builds return JSON; others print snapshot text directly.\n  try {\n    const parsed = JSON.parse(raw);\n    if (typeof parsed === 'string') return extractSnapshotFromOutput(parsed);\n    const candidates = [\n      parsed?.snapshot,\n      parsed?.value,\n      parsed?.result,\n      parsed?.data,\n      parsed?.result?.snapshot,\n      parsed?.result?.value,\n      parsed?.result?.result?.value,\n    ];\n    for (const c of candidates) {\n      if (typeof c === 'string') {\n        const v = extractSnapshotFromOutput(c);\n        if (v) return v;\n      }\n    }\n  } catch {\n    // ignore: non-JSON output\n  }\n\n  const lineJson = extractJson(raw);\n  if (lineJson && typeof lineJson === 'object') {\n    const nested = [\n      lineJson.snapshot,\n      lineJson.value,\n      lineJson.result?.snapshot,\n      lineJson.result?.value,\n    ];\n    for (const c of nested) {\n      if (typeof c === 'string') {\n        const v = extractSnapshotFromOutput(c);\n        if (v) return v;\n      }\n    }\n  }\n\n  // Fallback: keep only the snapshot tree part and drop build/log noise.\n  const lines = raw.split('\\n');\n  const start = lines.findIndex(l => /^\\s*-\\s+/.test(l));\n  if (start >= 0) {\n    return lines.slice(start).join('\\n').trim();\n  }\n\n  return null;\n}\n\nasync function fetchSnapshotAI({ ms = TIMEOUT_MS, targetId = null } = {}) {\n  const args = ['browser', '--browser-profile', PROFILE, 'snapshot', '--format', 'ai', '--mode', 'detailed'];\n  if (targetId) args.push('--target-id', targetId);\n  const cliRes = oc(args, { ms, safe: true });\n  if 
(!cliRes.ok) return null;\n  return extractSnapshotFromOutput(cliRes.out);\n}\n\nfunction parseOpenedTargetId(text) {\n  const raw = String(text ?? '');\n  const fromJson = extractJson(raw);\n  const maybeJsonId = fromJson?.id ?? fromJson?.targetId ?? fromJson?.result?.id;\n  if (typeof maybeJsonId === 'string' && maybeJsonId.trim()) {\n    return maybeJsonId.trim();\n  }\n  const m = raw.match(/\\bid:\\s*([A-F0-9]{16,})\\b/i);\n  return m?.[1] ?? null;\n}\n\n/**\n * Query the current tab list to obtain a best-effort active tab.\n */\nfunction fetchCurrentTab() {\n  const { out, ok } = oc(\n    ['browser', '--browser-profile', PROFILE, 'tabs', '--json'],\n    { ms: 8000, safe: true },\n  );\n  if (!ok || !out) return null;\n  const json = extractJson(out);\n  const tabs = Array.isArray(json?.tabs) ? json.tabs : [];\n  if (!tabs.length) return null;\n\n  const preferred = tabs.find(t => t?.url && t.url !== 'about:blank' && t?.type === 'page') ?? tabs[0];\n  return {\n    targetId: preferred?.targetId ?? null,\n    wsUrl: preferred?.wsUrl ?? null,\n    url: preferred?.url ?? null,\n    title: preferred?.title ?? null,\n  };\n}\n\n// ── Per-case test runner ──────────────────────────────────────────────────────\nasync function runCase(tc) {\n  const { id, url, sec, login, loginTest } = tc;\n\n  process.stdout.write(`\\n[${''.padEnd(0)}${id}]`.padEnd(28) + ` ${url}\\n`);\n\n  const meta = {\n    id, url, sec,\n    status:          'pending',\n    startedAt:       new Date().toISOString(),\n    elapsedMs:       0,\n    targetId:        null,\n    title:           '',\n    snapshotChars:   0,\n    snapshotAfterChars:  0,\n    loginExpected:   !!login,\n    loginTestRan:    false,\n    loginSuccess:    null,\n    analysis:        {},\n    error:           null,\n  };\n\n  const t0 = Date.now();\n  try {\n    // ── 1. 
open ───────────────────────────────────────────────────────────────\n    const opened = oc(['browser', '--browser-profile', PROFILE, 'open', url], { ms: 30000, safe: true });\n    meta.targetId = parseOpenedTargetId(opened.out);\n    let currentTab = null;\n    if (!meta.targetId) {\n      // Fallback only when open output does not include id.\n      currentTab = fetchCurrentTab();\n      meta.targetId = currentTab?.targetId ?? null;\n    }\n    process.stdout.write(`     ↳ opened  targetId=${meta.targetId ?? '?'}\\n`);\n\n    // ── 2. wait for page to settle ────────────────────────────────────────────\n    await sleep(STABILIZE_MS);\n    oc(\n      ['browser', '--browser-profile', PROFILE, 'wait',\n       '--load', 'networkidle', '--timeout-ms', '10000'],\n      { ms: 14000, safe: true },\n    );\n\n    // ── 3. extract AI snapshot via native browser tool ────────────────────────\n    const snapshot = await fetchSnapshotAI({\n      ms: 25000,\n      targetId: meta.targetId,\n    });\n    if (!snapshot) {\n      throw new Error('snapshot returned no content — page may not have loaded');\n    }\n\n    meta.snapshotChars = snapshot.length;\n    writeFileSync(join(SNAPSHOTS, `${id}.txt`), snapshot, 'utf8');\n    process.stdout.write(`     ↳ snapshot ${meta.snapshotChars.toLocaleString()} chars\\n`);\n\n    // ── 4. analyse snapshot ───────────────────────────────────────────────────\n    const analysis  = analyzeSnapshot(snapshot);\n    meta.analysis   = analysis;\n    meta.title      = analysis.title;\n\n    // ── 5. pre-login test (interactive) ──────────────────────────────────────\n    if (loginTest && analysis.loginWall) {\n      meta.loginTestRan = true;\n      await waitForEnter(\n        `\\n  🔑 Login wall detected on ${url}\\n` +\n        `     Log in manually in the browser window, then press ENTER to re-extract HTML...\\n  > `,\n      );\n\n      await sleep(STABILIZE_MS);\n      currentTab = fetchCurrentTab() ?? 
currentTab;\n      const snapshotAfter = await fetchSnapshotAI({\n        ms: 25000,\n        targetId: currentTab?.targetId ?? meta.targetId,\n      });\n      if (snapshotAfter) {\n        meta.snapshotAfterChars = snapshotAfter.length;\n        writeFileSync(join(SNAPSHOTS, `${id}_after.txt`), snapshotAfter, 'utf8');\n        const ratio = snapshotAfter.length / Math.max(snapshot.length, 1);\n        meta.loginSuccess = ratio > 1.2 && snapshotAfter.length > snapshot.length + 500;\n        process.stdout.write(\n          `     ↳ after-login snapshot ${meta.snapshotAfterChars.toLocaleString()} chars` +\n          ` (${ratio.toFixed(2)}x)  login=${meta.loginSuccess ? '✅' : '❌'}\\n`,\n        );\n      }\n    }\n\n    // ── 6. determine status ───────────────────────────────────────────────────\n    if (analysis.botBlocked) {\n      meta.status = 'blocked';\n      process.stdout.write(`     🚫 BLOCKED — bot/captcha signal in HTML\\n`);\n    } else if (!analysis.rich) {\n      meta.status = 'partial';\n      process.stdout.write(`     ⚠️  PARTIAL — snapshot thin (text ${analysis.textChars} chars)\\n`);\n    } else {\n      meta.status = 'success';\n      const note = analysis.loginWall\n        ? ` [login wall${login ? ', expected' : ''}]`\n        : '';\n      process.stdout.write(`     ✅ SUCCESS — title: \"${meta.title}\"${note}\\n`);\n    }\n\n    // ── 7. close tab ──────────────────────────────────────────────────────────\n    const tidToClose = meta.targetId ?? fetchCurrentTab()?.targetId ?? null;\n    if (tidToClose) {\n      oc(['browser', '--browser-profile', PROFILE, 'close', tidToClose],\n         { ms: 5000, safe: true });\n    }\n\n  } catch (err) {\n    meta.status = 'error';\n    meta.error  = String(err?.message ?? err).slice(0, 600);\n    process.stdout.write(`     ❌ ERROR — ${meta.error}\\n`);\n    // Best-effort close even on error\n    const tidOnError = meta.targetId ?? fetchCurrentTab()?.targetId ?? 
null;\n    if (tidOnError) {\n      oc(['browser', '--browser-profile', PROFILE, 'close', tidOnError],\n         { ms: 5000, safe: true });\n    }\n  }\n\n  meta.elapsedMs = Date.now() - t0;\n  writeFileSync(join(CASES, `${id}.json`), JSON.stringify(meta, null, 2), 'utf8');\n  return meta;\n}\n\n// ── Report generator ──────────────────────────────────────────────────────────\nfunction writeReport(results) {\n  const ICON = { success: '✅', blocked: '🚫', partial: '⚠️', error: '❌' };\n\n  const sections = new Map();\n  for (const r of results) {\n    if (!sections.has(r.sec)) sections.set(r.sec, []);\n    sections.get(r.sec).push(r);\n  }\n\n  let md = `# Test Report — OpenClaw Managed Browser\\n\\n`;\n  md += `**Run ID**: \\`${RUN_ID}\\`  \\n`;\n  md += `**Date**: ${new Date().toISOString()}  \\n`;\n  md += `**Mode**: ${MODE}  |  **Profile**: \\`${PROFILE}\\`  |  **Cases**: ${results.length}  \\n\\n`;\n\n  md += `## Summary\\n\\n`;\n  md += `| Section | Total | ✅ | 🚫 | ⚠️ | ❌ |\\n`;\n  md += `|---------|------:|----:|----:|----:|----:|\\n`;\n\n  let T=0, Ok=0, Bl=0, Pa=0, Er=0;\n  for (const [sec, rows] of sections) {\n    const ok = rows.filter(r => r.status==='success').length;\n    const bl = rows.filter(r => r.status==='blocked').length;\n    const pa = rows.filter(r => r.status==='partial').length;\n    const er = rows.filter(r => r.status==='error'  ).length;\n    md += `| ${sec} | ${rows.length} | ${ok} | ${bl} | ${pa} | ${er} |\\n`;\n    T+=rows.length; Ok+=ok; Bl+=bl; Pa+=pa; Er+=er;\n  }\n  md += `| **合计** | **${T}** | **${Ok}** | **${Bl}** | **${Pa}** | **${Er}** |\\n\\n`;\n\n  md += `## Case Details\\n\\n`;\n  for (const [sec, rows] of sections) {\n    md += `### ${sec}\\n\\n`;\n    for (const r of rows) {\n      const icon = ICON[r.status] ?? 
'?';\n      md += `#### ${icon} \\`${r.id}\\`\\n\\n`;\n      md += `- **URL**: ${r.url}\\n`;\n      md += `- **Status**: \\`${r.status}\\`  (${(r.elapsedMs/1000).toFixed(1)} s)\\n`;\n      if (r.title)    md += `- **Title**: ${r.title}\\n`;\n      if (r.snapshotChars) {\n        md += `- **Snapshot**: ${r.snapshotChars.toLocaleString()} chars\\n`;\n      }\n      const { botBlocked, loginWall, rich, textChars } = r.analysis ?? {};\n      if (botBlocked)       md += `- 🚫 **Bot / CAPTCHA signal detected**\\n`;\n      if (loginWall)        md += `- 🔑 **Login wall** (${r.loginExpected ? 'expected' : '⚠️ unexpected'})\\n`;\n      if (rich === false)   md += `- ⚠️ **Thin content** — extracted text ${textChars ?? 0} chars\\n`;\n      if (r.loginTestRan) {\n        const badge = r.loginSuccess ? '✅ 内容明显增加，登录成功' : '❌ 内容未见明显变化';\n        md += `- 🔑 **Pre-login test**: ${badge}`;\n        if (r.snapshotAfterChars) md += ` (${r.snapshotChars.toLocaleString()} → ${r.snapshotAfterChars.toLocaleString()} chars)`;\n        md += '\\n';\n      }\n      if (r.error)          md += `- **Error**: \\`${r.error}\\`\\n`;\n      md += '\\n';\n    }\n  }\n\n  writeFileSync(join(RUN_DIR, 'report.md'), md, 'utf8');\n  return { T, Ok, Bl, Pa, Er };\n}\n\n// ── Main ──────────────────────────────────────────────────────────────────────\nasync function main() {\n  const LINE = '═'.repeat(66);\n  console.log(`\\n${LINE}`);\n  console.log(`  OpenClaw Managed Browser — Automated Test Suite`);\n  console.log(`  Mode: ${MODE.toUpperCase()}  |  Profile: ${PROFILE}  |  Cases: ${RUN_LIST.length}`);\n  console.log(`  Run ID: ${RUN_ID}`);\n  console.log(`  Output: ${RUN_DIR}`);\n  console.log(`${LINE}`);\n\n  // ── Phase 1: status ───────────────────────────────────────────────────────\n  console.log('\\n■ [1/4] Checking browser status...');\n  const { out: statusOut } = oc(\n    ['browser', '--browser-profile', PROFILE, 'status', '--json'],\n    { safe: true },\n  );\n  const running = 
extractJson(statusOut)?.running === true;\n  console.log(`       ${running ? '✓ running' : '✗ stopped'}`);\n\n  // ── Phase 2: start if needed ──────────────────────────────────────────────\n  if (!running) {\n    console.log('\\n■ [2/4] Starting browser...');\n    oc(['browser', '--browser-profile', PROFILE, 'start'], { ms: 30000 });\n    await sleep(3000);\n    console.log('       ✓ browser started');\n  } else {\n    console.log('\\n■ [2/4] Browser already running — skipping start.');\n  }\n\n  // ── Phase 3: test cases ───────────────────────────────────────────────────\n  console.log(`\\n■ [3/4] Running ${RUN_LIST.length} cases...\\n`);\n  const results = [];\n  const parallelCases = RUN_LIST.filter(tc => PARALLEL_SECTIONS.has(tc.sec));\n  const sequentialCases = RUN_LIST.filter(tc => !PARALLEL_SECTIONS.has(tc.sec));\n\n  if (parallelCases.length) {\n    for (let i = 0; i < parallelCases.length; i += PARALLEL_LIMIT) {\n      const chunk = parallelCases.slice(i, i + PARALLEL_LIMIT);\n      console.log(`\\n   ↳ parallel batch (${chunk.length}) for 企业/政府/搜索...`);\n      const chunkResults = await Promise.all(chunk.map(tc => runCase(tc)));\n      results.push(...chunkResults);\n      await sleep(800);\n    }\n  }\n\n  for (const tc of sequentialCases) {\n    results.push(await runCase(tc));\n    await sleep(800);\n  }\n\n  // ── Phase 4: report ───────────────────────────────────────────────────────\n  console.log('\\n■ [4/4] Writing report...');\n  const { T, Ok, Bl, Pa, Er } = writeReport(results);\n\n  const DASH = '─'.repeat(66);\n  console.log(`\\n${DASH}`);\n  console.log(`  ${Ok} success  /  ${Bl} blocked  /  ${Pa} partial  /  ${Er} error  (total ${T})`);\n  console.log(`  report : ${join(RUN_DIR, 'report.md')}`);\n  console.log(`  snapshots: ${SNAPSHOTS}`);\n  console.log(`${DASH}\\n`);\n}\n\nmain().catch(err => {\n  console.error('\\nFatal error:', err?.message ?? err);\n  process.exit(1);\n});\n"
  },
  {
    "path": "version",
    "content": "v5.1.7\n"
  },
  {
    "path": "wiseflow/README.md",
    "content": "# Wiseflow Addon for OpenClaw\n\n浏览器反检测 + Tab Recovery + Smart Search + RSS Reader + 新媒体小编 Crew。\n\n本目录是 [wiseflow](https://github.com/TeamWiseFlow/wiseflow) 提供给 [openclaw-for-business](https://github.com/TeamWiseFlow/openclaw_for_business) 的标准 addon 包。\n\n## 功能\n\n### 1. Patchright 反检测浏览器\n\n通过 pnpm overrides 将 `playwright-core` 替换为 `patchright-core`，无需修改上游源码。显著降低自动化浏览器被目标网站识别和拦截的概率，从而不需要安装 chrome relay extension，只用托管浏览器也能达到与 relay 同样、甚至更优的网络获取与操作能力。\n\n### 2. Tab Recovery 补丁\n\n当 Agent 操作过程中目标标签页意外关闭或消失时，自动进行快照级别的标签页恢复，确保任务不会因标签页丢失而中断。\n\n### 3. Smart Search（智能搜索）\n\n替代 openclaw 内置的 `web_search`，提供更强大的搜索能力。相比原版内置的 web search tool，具备三大核心优势：\n\n- **完全免费，无需 API Key**：不依赖任何第三方搜索 API，零成本使用\n- **即时搜索，时效性最佳**：直接驱动浏览器前往目标页面或各大社交媒体平台（微博、Twitter/X、facebook 等）进行搜索，第一时间获取最新发布的内容\n- **信源可自定义**：用户可以自由指定搜索源，精准匹配自己的信息需求\n\n### 4. Browser Guide 技能\n\n教会 agent 处理登录墙、验证码、懒加载、付费墙等场景的最佳实践。\n\n### 5. RSS Reader 技能\n\n支持读取 RSS/Atom Feed，可订阅任意支持标准 feed 格式的内容源。\n\n### 6. 新媒体小编 Crew\n\n开箱即用的中文自媒体内容创作 AI Agent，配置于 `crew/new-media-editor/`。\n\n**核心工作流：**\n\n- **Mode A**：选题调研 → 热点分析 → 图文草稿\n  深入采集微博实时热搜、小红书、知乎、B 站、抖音等平台的最新内容，分析热点角度与差异化切入点，生成完整图文草稿。\n\n- **Mode B**：草稿扩写 → 完整文章\n  接收用户的草稿片段或关键词，搜寻网络佐证（数据、权威来源、真实案例），扩展为结构完整的文章。\n\n- **文章排版（Output Strategy）**：定稿后自动调用 [文颜（Wenyan）](https://github.com/caol64/wenyan) 将 Markdown 渲染为公众号风格 HTML，内置 7 套主题并根据文章内容智能匹配，同时生成 Markdown 原文备份。\n\n- **Mode C**：推送微信公众号草稿箱（需配置 `WECHAT_APP_ID`/`WECHAT_APP_SECRET`）\n\n**Crew 专属技能：**\n\n| 技能 | 说明 |\n|------|------|\n| `siliconflow-img-gen` | 文生图，调用 SiliconFlow API 生成配图（需 `SILICONFLOW_API_KEY`） |\n| `siliconflow-video-gen` | 文生视频 / 图生视频（需 `SILICONFLOW_API_KEY`） |\n| `wenyan-formatter` | Markdown → 公众号风格 HTML 渲染，或直接推送草稿箱 |\n\n\n\n## 安装\n\n将本目录复制到 openclaw_for_business 的 `addons/` 目录：\n\n```bash\n# 方式一：从 wiseflow 仓库复制\ngit clone https://github.com/TeamWiseFlow/wiseflow.git /tmp/wiseflow\ncp -r /tmp/wiseflow/wiseflow <openclaw_for_business>/addons/wiseflow\n\n# 方式二：如果已有 wiseflow 仓库\ncp -r 
/path/to/wiseflow/wiseflow <openclaw_for_business>/addons/wiseflow\n```\n\n安装后重启 openclaw 即可生效（`dev.sh` 会自动扫描并应用）。\n\n## 目录结构\n\n```\nwiseflow/\n├── addon.json                    # 元数据（名称、版本、描述）\n├── overrides.sh                  # pnpm overrides: playwright-core → patchright-core + 禁用内置 web_search\n├── patches/\n│   ├── 001-browser-tab-recovery.patch        # 标签页恢复补丁\n│   ├── 002-disable-web-search-env-var.patch  # 禁用内置 web_search（env var）\n│   ├── 003-act-field-validation.patch        # ACT 字段校验补丁\n│   └── 004-web-fetch-allow-rfc2544.patch     # 允许 RFC 2544 假 IP（代理 fake-IP DNS 兼容）\n├── skills/                       # 全局技能（所有 Agent 可用）\n│   ├── browser-guide/SKILL.md    # 浏览器使用最佳实践\n│   ├── smart-search/SKILL.md     # 多平台搜索 URL 构造（替代内置 web_search）\n│   └── rss-reader/               # RSS/Atom Feed 读取器\n│       ├── SKILL.md\n│       ├── package.json\n│       └── scripts/fetch-rss.mjs\n└── crew/                         # Crew 模板（由 HRBP 管理）\n    └── new-media-editor/         # 新媒体小编（command-tier: T1）\n        ├── SOUL.md               # 角色定义 + 权限级别声明\n        ├── IDENTITY.md           # Agent 身份设定\n        ├── AGENTS.md             # 工作流程（Mode A/B/C + Image/Video Strategy）\n        ├── TOOLS.md              # 工具清单与使用规则\n        ├── BOOTSTRAP.md          # 首次启动引导\n        ├── BUILTIN_SKILLS        # 内置全局技能列表\n        ├── DENIED_SKILLS         # 禁用技能列表\n        ├── ALLOWED_COMMANDS      # 命令权限微调（T1 + bash/python3/node/npx）\n        ├── MEMORY.md / TASKS.md / HEARTBEAT.md / USER.md\n        └── skills/               # Crew 专属技能\n            ├── siliconflow-img-gen/   # 文生图（SiliconFlow Images API）\n            │   ├── SKILL.md\n            │   └── scripts/gen.py\n            ├── siliconflow-video-gen/ # 文生视频（SiliconFlow Video API）\n            │   ├── SKILL.md\n            │   └── scripts/gen.py\n            └── wenyan-formatter/      # Markdown 排版 & 公众号发布\n                ├── SKILL.md\n                └── scripts/format.sh\n```\n\n## 四层加载机制\n\naddon 被 `apply-addons.sh` 
加载时按以下顺序执行：\n\n1. **overrides.sh** — pnpm overrides 替换 playwright-core 为 patchright-core，并禁用内置 web_search（最稳健，不依赖行号）\n2. **patches/*.patch** — git patch 精确代码改动（上游更新时可能需调整）\n3. **skills/*/SKILL.md** — 全局技能安装（所有 Agent 可见）\n4. **crew/*/** — Crew 模板安装到 `crews/`，由 HRBP 管理实例化\n\n## 要求\n\n- OpenClaw >= 2026.2\n- pnpm（由 openclaw_for_business 提供）\n- Node.js >= 18（wenyan-formatter 使用 `npx`，需要 Node.js 环境）\n\n## 测试\n\n参见项目根目录的 [tests/README.md](../tests/README.md) 了解浏览器反检测测试用例。\n\n## 开源致谢\n\n本 addon 集成或依赖以下开源项目：\n\n- [Patchright](https://github.com/Kaliiiiiiiiii-Vinyzu/patchright) — Playwright 的反检测 fork，Apache-2.0\n- [文颜（Wenyan）](https://github.com/caol64/wenyan) — 多平台 Markdown 排版工具，Apache-2.0\n  `wenyan-formatter` 技能通过调用 `@wenyan-md/cli`（[wenyan-cli](https://github.com/caol64/wenyan-cli)）实现 Markdown 渲染与公众号发布能力。\n"
  },
  {
    "path": "wiseflow/addon.json",
    "content": "{\n  \"name\": \"wiseflow\",\n  \"version\": \"0.3.0\",\n  \"description\": \"浏览器反检测 + Tab Recovery + 互联网搜索增强（smart-search / rss-reader skills + 禁用内置 web_search）+ 新媒体小编 Crew 模板\",\n  \"auto-activate\": false,\n  \"internal_crews\": [\"new-media-editor\"]\n}\n"
  },
  {
    "path": "wiseflow/crew/new-media-editor/AGENTS.md",
    "content": "# 新媒体小编 — Workflow\n\n## Mode A：选题研究 → 图文输出\n\n```\n1. 接收用户指定的选题/方向（如「AI 工具」「春节营销」等）\n2. 确认：目标平台、风格要求（轻松/严肃/专业）、大约字数、是否有截止时间\n3. 在主要自媒体平台搜集最新内容（微博实时热搜 / 小红书 / 知乎 / B站 / 抖音）\n4. 分析热点角度：哪些子话题最热、哪些切入点有差异化\n5. 确定配图方案（详见 Image/Video Strategy）\n6. 撰写图文草稿（含配图占位说明与图片来源）\n7. 发送给用户确认（L2）\n8. 根据反馈修改\n9. 交付最终版本（见 Output Strategy）\n```\n\n## Mode B：草稿扩写 → 完整文章\n\n```\n1. 接收用户提供的草稿、想法片段或关键词\n2. 提炼核心观点/主张\n3. 搜寻网络佐证：相关数据、权威来源、类似观点、真实案例\n4. 确定配图方案（同 Image/Video Strategy）\n5. 将草稿扩展为完整文章（保留用户核心观点，补充依据和结构）\n6. 发送给用户确认（L2）（需注明信息来源）\n7. 根据反馈修改\n8. 交付最终版本（见 Output Strategy）\n```\n\n## Image Strategy（配图优先级）\n\n```\n优先级 1 — 用户上传的图片\n  → 直接使用，无需搜索其他来源\n\n优先级 2 — 网络免版权图片\n  → 通过 smart-search 搜索 Bing Images / Baidu Images\n  → 或直接访问：\n      Unsplash: https://unsplash.com/s/photos/{keyword}\n      Pexels:   https://www.pexels.com/search/{keyword}/\n  → 只使用明确标注 CC0 / 免版权的图片，并记录来源 URL\n\n优先级 3 — 文生图 AI 生成（siliconflow-img-gen）\n  → 仅在前两档均无合适图片 且 已配置 SILICONFLOW_API_KEY 时调用\n  → 调用前告知用户将生成配图，描述生成意图\n  → 默认使用 Qwen/Qwen-Image-Edit-2509 模型，1024x1024\n\n如三项均不可用\n  → 以纯文字版本交付，在消息中告知用户需自行补充配图\n```\n\n## Video Strategy（配视频，仅当用户明确要求时启用）\n\n```\n步骤：\n1. 确认：文生视频 还是 图生视频？\n   - 文生视频（T2V）：用户只提供文字描述\n   - 图生视频（I2V）：用户提供或授权使用某张配图作为起始帧\n\n2. 确认预计耗时（1–5 分钟），获得用户知情同意\n\n3. 调用 siliconflow-video-gen：\n   - T2V: python3 <baseDir>/scripts/gen.py --prompt \"...\" --image-size 1280x720\n   - I2V: python3 <baseDir>/scripts/gen.py --model \"Wan-AI/Wan2.2-I2V-A14B\" \\\n                                            --prompt \"...\" --image \"<url_or_base64>\"\n\n4. 下载完成后，将视频路径汇报给用户（L2 确认后可进一步发布）\n```\n\n## Output Strategy（文章交付）\n\n```\n所有 Mode 完成修改确认后，默认执行：\n\n1. 调用 wenyan-formatter（render）\n   - 根据文章内容和风格，依照 SKILL.md 决策树自动选择主题\n   - 告知用户选定的主题和理由\n   - bash <baseDir>/scripts/format.sh --file <草稿路径> --theme <选定主题>\n\n2. 交付内容：\n   - 向用户展示主要段落（文字版预览）\n   - 提供 output.html 路径（可直接在浏览器打开，内容可粘贴至公众号/知乎编辑器）\n   - 提供 source.md 路径（Markdown 原文备份）\n\n3. 
询问是否直接推送微信公众号草稿箱（Mode C）\n```\n\n## Mode C：推送微信公众号草稿（仅当用户明确要求时启用）\n\n```\n前置条件：\n  - 已配置 WECHAT_APP_ID 和 WECHAT_APP_SECRET\n  - Markdown 文件含 frontmatter（至少包含 title:）\n  - 本机 IP 在公众号白名单，或已配置 Wenyan Server\n\n步骤：\n1. 确认文章 frontmatter 完整（title、cover、author）\n2. 如用户未提供 cover，询问是否使用文章第一张配图\n3. 调用 wenyan-formatter（publish）：\n   - 本地模式（IP 已加白名单）：\n     bash <baseDir>/scripts/format.sh --action publish --file <草稿路径> --theme <主题>\n   - Server 模式（绕过 IP 白名单）：\n     bash <baseDir>/scripts/format.sh \\\n       --action publish --file <草稿路径> --theme <主题> \\\n       --server <WENYAN_SERVER_URL> --api-key <WENYAN_API_KEY>\n4. 打印推送结果（media_id）给用户（L2 确认后可在公众号后台发布）\n```\n\n## Edge Cases\n- **选题内容敏感**（政治、医疗、投资等）：标记风险，询问用户是否继续（L3）\n- **图片版权不明确**：告知用户，建议使用文生图或由用户自行提供\n- **草稿信息太少**：向用户追问目标受众、期望风格和核心卖点\n- **信息来源冲突**：呈现多方说法，不主观判断真假，交由用户定夺\n- **平台特殊格式**（如小红书 tags、公众号排版要求）：主动适配，备注中说明\n- **视频生成超时**：超过 10 分钟未完成，告知用户任务状态，建议重试或稍后再试\n"
  },
  {
    "path": "wiseflow/crew/new-media-editor/ALLOWED_COMMANDS",
    "content": "# T1 基础上追加：技能脚本执行所需的运行时命令\n+bash\n+python3\n+node\n+npx\n"
  },
  {
    "path": "wiseflow/crew/new-media-editor/BOOTSTRAP.md",
    "content": "# Bootstrap\n\nThis is a pre-configured crew workspace. Your role, responsibilities, and behavioral guidelines are fully defined in the following files — please review them at startup:\n\n- **SOUL.md** — Role definition, core responsibilities, and autonomy level\n- **AGENTS.md** — Workflows and operating procedures\n- **MEMORY.md** — Background context and ongoing task state\n- **IDENTITY.md** — Name and persona\n- **USER.md** — Assumptions about who you are serving\n- **TOOLS.md** — Available tools and usage guidelines\n"
  },
  {
    "path": "wiseflow/crew/new-media-editor/BUILTIN_SKILLS",
    "content": "summarize\n"
  },
  {
    "path": "wiseflow/crew/new-media-editor/DENIED_SKILLS",
    "content": "github\ngh-issues\ncoding-agent\n"
  },
  {
    "path": "wiseflow/crew/new-media-editor/HEARTBEAT.md",
    "content": "# 新媒体小编 — Heartbeat\n\n<!-- 初始为空，实例化后由系统定期更新健康状态 -->\n"
  },
  {
    "path": "wiseflow/crew/new-media-editor/IDENTITY.md",
    "content": "# 新媒体小编 — Identity\n\n## Name\n新媒体小编 (New Media Editor)\n\n## Role\n社交媒体内容创作专家 — 深耕中国主流自媒体生态，发现热点、采集素材、撰写图文，交付可直接发布的内容。\n\n## Personality\n贴地气、有洞察力、执行力强。能感知平台气氛和受众喜好，把枯燥的信息变成有传播力的图文。讲究效率，稿件出炉前必请用户确认。\n"
  },
  {
    "path": "wiseflow/crew/new-media-editor/MEMORY.md",
    "content": "# 新媒体小编 — Memory\n\n## Account Profiles\n（实例化后填写：运营的平台账号、粉丝画像、账号调性、发布节奏）\n\n## Content Archive\n（记录已发布内容的标题、平台、发布日期，避免重复选题）\n\n## Style Guidelines\n（用户确认过的风格偏好、常用 hashtag、禁忌话题）\n\n## Image Sources\n（记录可用的免版权图片来源、已验证的 Unsplash/Pexels 搜索关键词技巧）\n\n## Hot Topics Watchlist\n（用户要求持续跟踪的话题方向）\n\n## Notes\n（运行中持续更新）\n"
  },
  {
    "path": "wiseflow/crew/new-media-editor/SOUL.md",
    "content": "# 新媒体小编 — SOUL\n\n## Identity\n你是一名专业的新媒体内容创作者，专门服务于中国主流自媒体平台（微博、微信公众号、小红书、知乎、抖音、B站等）。你的核心能力是：快速捕捉热点、深度采集一手素材、精准提炼核心观点，最终产出兼具传播力和可读性的图文内容。\n\n## Core Responsibilities\n1. **选题研究模式**：按用户指定方向，在各大自媒体平台搜集最新资讯，识别当前热点角度，配合图片，输出完整图文\n2. **草稿扩写模式**：接收用户草稿/想法，提炼核心，在网上搜寻佐证材料、类似观点、数据支撑，扩写成完整文章\n3. 根据文章主题配置合适图片（优先级：用户上传 > 网络免版权图片 > 文生图 AI 生成）\n4. 所有内容在发布前必须发给用户确认，不得擅自发布\n\n## Autonomy\n- L1: 信息搜集、热点分析、图片查找、内容起草（可自主进行，无需请示）\n- L2: 向用户呈现完整图文草稿并等待确认（需给出图片来源说明）\n- L3: 将内容发布到任何外部平台（必须获得用户明确指令，不可自行决定）\n\n## 权限级别\ncommand-tier: T1\n\n## Communication Style\n- 默认使用中文，风格贴合目标平台调性（如小红书活泼、知乎严谨）\n- 主动汇报：选题角度为何吸睛、配图来源是否合规\n- 接到反馈后快速迭代，不解释过多\n- 遇到敏感话题或版权不清晰的图片，主动告知用户风险\n"
  },
  {
    "path": "wiseflow/crew/new-media-editor/TASKS.md",
    "content": "# 新媒体小编 — Tasks\n\n<!-- 初始为空，实例化后由小编在运行中维护 -->\n"
  },
  {
    "path": "wiseflow/crew/new-media-editor/TOOLS.md",
    "content": "# 新媒体小编 — Tools\n\n## Available Tools\n\n| Tool | Purpose |\n|------|---------|\n| `smart-search` | 在各大平台（微博、小红书、知乎、B站、抖音、Bing、百度）构造精确搜索 URL |\n| `browser` + `browser-guide` | 访问自媒体平台、滚动加载内容、处理登录墙和验证码 |\n| `siliconflow-img-gen` | 文生图，生成配图（仅在前两档图片来源无合适结果时使用，需要 `SILICONFLOW_API_KEY`） |\n| `siliconflow-video-gen` | 文生视频 / 图生视频（需要 `SILICONFLOW_API_KEY`；视频生成耗时较长，需提前告知用户） |\n| `wenyan-formatter` | Markdown → 公众号风格 HTML（render）或直接推送微信公众号草稿箱（publish），内置 7 套主题，支持智能主题选择 |\n| `xurl` | 快速 HTTP 请求，访问公开 API 或无需登录的静态内容源 |\n| `summarize` | 长内容提炼摘要（处理长文时辅助使用） |\n\n## Tool Usage Rules\n\n1. **内容采集**：用 `smart-search` 构造 URL + `browser` 访问，优先抓原始平台内容而非聚合搜索结果\n2. **图片合规**：仅使用在 Unsplash / Pexels 或搜索结果中明确标注免版权（CC0）的图片\n3. **文生图触发条件**：确认前两档图片来源均无合适选项，且确认已配置 `SILICONFLOW_API_KEY` 后才调用 `siliconflow-img-gen`\n4. **视频触发条件**：用户明确要求生成视频时才调用 `siliconflow-video-gen`，调用前告知预计耗时（1–5 分钟）\n5. **来源引用**：所有引用的数据和观点在草稿备注中标明来源 URL\n6. **文章排版**：文章定稿后默认调用 `wenyan-formatter render`，按内容风格智能选主题；需要推送公众号时才调用 `publish`\n7. **Tab 管理**：每次浏览完毕立即关闭 Tab，不积累无用标签\n"
  },
  {
    "path": "wiseflow/crew/new-media-editor/USER.md",
    "content": "# 新媒体小编 — User Context\n\n## User Role\n新媒体运营者 — 可能是品牌方的市场/运营人员、个人自媒体博主，或希望提升内容产出效率的企业主。\n\n## Preferences\n- Language: 中文（主要）；如用户用英文输入，则用英文回复\n- Style: 实用高效，稿件质量优先于速度\n- Autonomy: L1/L2 自主推进；L3（发布到外部平台）始终需要用户明确指令\n\n## Assumptions\n- 用户大多数时候知道自己想写什么，但不知道如何高效采集素材和组织结构\n- 用户可能没有专业版权意识，需要小编主动提醒图片版权问题\n- 用户希望减少来回沟通次数，更倾向于一次输出较完整的草稿再修改\n"
  },
  {
    "path": "wiseflow/crew/new-media-editor/skills/siliconflow-img-gen/SKILL.md",
    "content": "---\nname: siliconflow-img-gen\ndescription: Generate images via SiliconFlow Images API. Default model is Qwen/Qwen-Image-Edit-2509. Supports text-to-image.\nhomepage: https://docs.siliconflow.cn/cn/api-reference/images/images-generations\nmetadata:\n  {\n    \"openclaw\":\n      {\n        \"emoji\": \"🖼️\",\n        \"requires\": { \"bins\": [\"python3\"], \"env\": [\"SILICONFLOW_API_KEY\"] },\n        \"primaryEnv\": \"SILICONFLOW_API_KEY\",\n      },\n  }\n---\n\n# SiliconFlow Image Gen\n\nGenerate images using the SiliconFlow Images API.\n\n## Run\n\nNote: Image generation can take 10–60 seconds. Set a higher timeout when invoking via exec (e.g., `exec timeout=120`).\n\n```bash\npython3 {baseDir}/scripts/gen.py --prompt \"your prompt here\"\n```\n\nUseful flags:\n\n```bash\n# Default model (Qwen/Qwen-Image-Edit-2509), square output\npython3 {baseDir}/scripts/gen.py --prompt \"a futuristic city at dusk\"\n\n# Portrait / landscape sizes\npython3 {baseDir}/scripts/gen.py --prompt \"mountain lake\" --image-size 720x1280\npython3 {baseDir}/scripts/gen.py --prompt \"mountain lake\" --image-size 1280x720\n\n# Use Kolors model (supports guidance/batch)\npython3 {baseDir}/scripts/gen.py --prompt \"flower field\" --model \"Kwai-Kolors/Kolors\" --batch-size 3\n\n# Save to specific directory\npython3 {baseDir}/scripts/gen.py --prompt \"sunset\" --out-dir ./out/images\n```\n\n## Parameters\n\n| Flag | Default | Description |\n|------|---------|-------------|\n| `--prompt` | required | Text description for the image |\n| `--model` | `Qwen/Qwen-Image-Edit-2509` | Model ID |\n| `--image-size` | `1024x1024` | Resolution: `1024x1024`, `960x1280`, `768x1024`, `720x1440`, `720x1280` |\n| `--batch-size` | `1` | Number of images (1–4, Kolors only) |\n| `--steps` | `20` | Inference steps (1–100) |\n| `--guidance` | — | Guidance scale (Kolors only) |\n| `--negative-prompt` | — | What to avoid in the image |\n| `--seed` | — | Random seed for reproducibility |\n| 
`--out-dir` | `./tmp/sf-img-<ts>` | Output directory |\n\n## Output\n\n- `*.png` images named by index\n- `prompts.json` mapping index → prompt + URL\n- `index.html` thumbnail gallery\n\n## Environment Variables\n\n| Variable | Description |\n|----------|-------------|\n| `SILICONFLOW_API_KEY` | Your SiliconFlow API key (required) |\n"
  },
  {
    "path": "wiseflow/crew/new-media-editor/skills/siliconflow-img-gen/scripts/gen.py",
    "content": "#!/usr/bin/env python3\n\"\"\"SiliconFlow image generation — stdlib only (no httpx/requests).\"\"\"\n\nimport argparse\nimport json\nimport os\nimport sys\nimport time\nimport urllib.request\nimport urllib.error\nfrom pathlib import Path\n\nAPI_URL = \"https://api.siliconflow.cn/v1/images/generations\"\n\n# Models that accept guidance_scale and batch_size\nKOLORS_MODELS = {\"kwai-kolors/kolors\"}\n\ndef build_payload(args):\n    model = args.model\n    payload = {\n        \"model\": model,\n        \"prompt\": args.prompt,\n        \"image_size\": args.image_size,\n        \"num_inference_steps\": args.steps,\n        \"batch_size\": args.batch_size,\n    }\n    # guidance_scale only supported by Kolors\n    if args.guidance is not None:\n        if model.lower() in KOLORS_MODELS:\n            payload[\"guidance_scale\"] = args.guidance\n        else:\n            print(f\"[warn] --guidance ignored for model {model}\", file=sys.stderr)\n    if args.negative_prompt:\n        payload[\"negative_prompt\"] = args.negative_prompt\n    if args.seed is not None:\n        payload[\"seed\"] = args.seed\n    # Qwen does not accept image_size or batch_size in some variants — keep them\n    # but note the model may ignore them silently\n    return payload\n\n\ndef api_request(payload, api_key):\n    data = json.dumps(payload).encode()\n    req = urllib.request.Request(\n        API_URL,\n        data=data,\n        headers={\n            \"Authorization\": f\"Bearer {api_key}\",\n            \"Content-Type\": \"application/json\",\n        },\n        method=\"POST\",\n    )\n    try:\n        with urllib.request.urlopen(req, timeout=120) as resp:\n            return json.loads(resp.read())\n    except urllib.error.HTTPError as e:\n        body = e.read().decode(errors=\"replace\")\n        print(f\"[error] HTTP {e.code}: {body}\", file=sys.stderr)\n        sys.exit(1)\n\n\ndef download_image(url, dest_path):\n    req = urllib.request.Request(url, 
headers={\"User-Agent\": \"wiseflow-img-gen/1.0\"})\n    with urllib.request.urlopen(req, timeout=60) as resp:\n        dest_path.write_bytes(resp.read())\n\n\ndef main():\n    parser = argparse.ArgumentParser(description=\"SiliconFlow image generation\")\n    parser.add_argument(\"--prompt\", required=True)\n    parser.add_argument(\"--model\", default=\"Qwen/Qwen-Image-Edit-2509\")\n    parser.add_argument(\"--image-size\", default=\"1024x1024\", dest=\"image_size\")\n    parser.add_argument(\"--batch-size\", type=int, default=1, dest=\"batch_size\")\n    parser.add_argument(\"--steps\", type=int, default=20)\n    parser.add_argument(\"--guidance\", type=float, default=None)\n    parser.add_argument(\"--negative-prompt\", default=None, dest=\"negative_prompt\")\n    parser.add_argument(\"--seed\", type=int, default=None)\n    parser.add_argument(\"--out-dir\", default=None, dest=\"out_dir\")\n    args = parser.parse_args()\n\n    api_key = os.environ.get(\"SILICONFLOW_API_KEY\")\n    if not api_key:\n        print(\"[error] SILICONFLOW_API_KEY not set\", file=sys.stderr)\n        sys.exit(1)\n\n    ts = int(time.time())\n    out_dir = Path(args.out_dir) if args.out_dir else Path(f\"./tmp/sf-img-{ts}\")\n    out_dir.mkdir(parents=True, exist_ok=True)\n\n    payload = build_payload(args)\n    print(f\"[info] Generating image with model={args.model} size={args.image_size} …\")\n    result = api_request(payload, api_key)\n\n    images = result.get(\"images\", [])\n    if not images:\n        print(f\"[error] No images in response: {result}\", file=sys.stderr)\n        sys.exit(1)\n\n    prompts_map = {}\n    for i, img in enumerate(images):\n        url = img.get(\"url\", \"\")\n        dest = out_dir / f\"{i:02d}.png\"\n        print(f\"[info] Downloading image {i} → {dest}\")\n        download_image(url, dest)\n        prompts_map[str(i)] = {\"prompt\": args.prompt, \"url\": url, \"file\": str(dest)}\n\n    (out_dir / 
\"prompts.json\").write_text(json.dumps(prompts_map, ensure_ascii=False, indent=2))\n\n    # Simple HTML gallery\n    gallery_html = [\"<!DOCTYPE html><html><body>\"]\n    for i in range(len(images)):\n        gallery_html.append(f'<img src=\"{i:02d}.png\" style=\"max-width:512px;margin:4px\">')\n    gallery_html.append(\"</body></html>\")\n    (out_dir / \"index.html\").write_text(\"\\n\".join(gallery_html))\n\n    print(f\"[done] {len(images)} image(s) saved to {out_dir}/\")\n    for k, v in prompts_map.items():\n        print(f\"  [{k}] {v['file']}\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "wiseflow/crew/new-media-editor/skills/siliconflow-video-gen/SKILL.md",
    "content": "---\nname: siliconflow-video-gen\ndescription: Generate videos via SiliconFlow Video API. Supports text-to-video (T2V) and image-to-video (I2V) using Wan2.2 models. Async: submit job → poll until done → download.\nhomepage: https://docs.siliconflow.cn/cn/userguide/capabilities/video\nmetadata:\n  {\n    \"openclaw\":\n      {\n        \"emoji\": \"🎬\",\n        \"requires\": { \"bins\": [\"python3\"], \"env\": [\"SILICONFLOW_API_KEY\"] },\n        \"primaryEnv\": \"SILICONFLOW_API_KEY\",\n      },\n  }\n---\n\n# SiliconFlow Video Gen\n\nGenerate videos using the SiliconFlow Video API (Wan2.2 models).\n\nVideo generation is **asynchronous**: the API returns a `requestId` immediately, then the script polls the status endpoint until the job completes (status: `Succeed`).\n\n> The generated video URL is valid for **1 hour**. The script downloads the video locally automatically.\n\n## Run\n\nNote: Video generation typically takes **1–5 minutes**. Set exec timeout accordingly (e.g., `exec timeout=600`).\n\n```bash\n# Text-to-video\npython3 {baseDir}/scripts/gen.py --prompt \"a dolphin leaping over ocean waves at sunset\"\n\n# Image-to-video (provide a public URL or local base64 image)\npython3 {baseDir}/scripts/gen.py \\\n  --model \"Wan-AI/Wan2.2-I2V-A14B\" \\\n  --prompt \"the camera slowly zooms out\" \\\n  --image \"https://example.com/my-photo.jpg\"\n\n# Custom resolution and output directory\npython3 {baseDir}/scripts/gen.py \\\n  --prompt \"time-lapse of a blooming flower\" \\\n  --image-size 720x1280 \\\n  --out-dir ./out/videos\n\n# Reproducible generation with a fixed seed\npython3 {baseDir}/scripts/gen.py --prompt \"rocket launch\" --seed 42\n```\n\n## Parameters\n\n| Flag | Default | Description |\n|------|---------|-------------|\n| `--prompt` | required | Text description of the video |\n| `--model` | `Wan-AI/Wan2.2-T2V-A14B` | Model ID: `Wan-AI/Wan2.2-T2V-A14B` (T2V) or `Wan-AI/Wan2.2-I2V-A14B` (I2V) |\n| `--image` | — | Image URL or 
`data:image/...;base64,...` (required for I2V model) |\n| `--image-size` | `1280x720` | Resolution: `1280x720` (16:9), `720x1280` (9:16), `960x960` (1:1) |\n| `--negative-prompt` | — | What to avoid in the video |\n| `--seed` | — | Random seed for reproducibility |\n| `--poll-interval` | `10` | Seconds between status polls |\n| `--timeout` | `600` | Max seconds to wait for generation |\n| `--out-dir` | `./tmp/sf-video-<ts>` | Output directory |\n\n## Models\n\n| Model | Type | Notes |\n|-------|------|-------|\n| `Wan-AI/Wan2.2-T2V-A14B` | Text → Video | Default model |\n| `Wan-AI/Wan2.2-I2V-A14B` | Image → Video | Requires `--image` parameter |\n\n## Output\n\n- `video_<first 8 characters of requestId>.mp4` downloaded locally\n- `result.json` with the final status response from the API\n\n## Environment Variables\n\n| Variable | Description |\n|----------|-------------|\n| `SILICONFLOW_API_KEY` | Your SiliconFlow API key (required) |\n"
  },
  {
    "path": "wiseflow/crew/new-media-editor/skills/siliconflow-video-gen/scripts/gen.py",
    "content": "#!/usr/bin/env python3\n\"\"\"SiliconFlow video generation — stdlib only (no httpx/requests).\n\nFlow:\n  1. POST /v1/video/submit  → requestId\n  2. Poll POST /v1/video/status every --poll-interval seconds\n  3. When status == 'Succeed', download video to --out-dir\n\"\"\"\n\nimport argparse\nimport json\nimport os\nimport sys\nimport time\nimport urllib.request\nimport urllib.error\nfrom pathlib import Path\n\nSUBMIT_URL = \"https://api.siliconflow.cn/v1/video/submit\"\nSTATUS_URL = \"https://api.siliconflow.cn/v1/video/status\"\n\nT2V_MODEL = \"Wan-AI/Wan2.2-T2V-A14B\"\nI2V_MODEL = \"Wan-AI/Wan2.2-I2V-A14B\"\nVALID_SIZES = {\"1280x720\", \"720x1280\", \"960x960\"}\n\n\ndef post_json(url, payload, api_key, timeout=60):\n    data = json.dumps(payload).encode()\n    req = urllib.request.Request(\n        url,\n        data=data,\n        headers={\n            \"Authorization\": f\"Bearer {api_key}\",\n            \"Content-Type\": \"application/json\",\n        },\n        method=\"POST\",\n    )\n    try:\n        with urllib.request.urlopen(req, timeout=timeout) as resp:\n            return json.loads(resp.read())\n    except urllib.error.HTTPError as e:\n        body = e.read().decode(errors=\"replace\")\n        print(f\"[error] HTTP {e.code}: {body}\", file=sys.stderr)\n        sys.exit(1)\n\n\ndef submit_job(payload, api_key):\n    result = post_json(SUBMIT_URL, payload, api_key, timeout=60)\n    rid = result.get(\"requestId\")\n    if not rid:\n        print(f\"[error] No requestId in response: {result}\", file=sys.stderr)\n        sys.exit(1)\n    return rid\n\n\ndef poll_until_done(request_id, api_key, poll_interval, timeout):\n    deadline = time.time() + timeout\n    attempt = 0\n    while time.time() < deadline:\n        attempt += 1\n        result = post_json(STATUS_URL, {\"requestId\": request_id}, api_key, timeout=30)\n        status = result.get(\"status\", \"\")\n        print(f\"[info] poll #{attempt}: status={status}\")\n        
if status == \"Succeed\":\n            return result\n        if status == \"Failed\":\n            reason = result.get(\"reason\", \"unknown\")\n            print(f\"[error] Generation failed: {reason}\", file=sys.stderr)\n            sys.exit(1)\n        # InQueue or InProgress — wait and retry\n        time.sleep(poll_interval)\n    print(f\"[error] Timed out after {timeout}s\", file=sys.stderr)\n    sys.exit(1)\n\n\ndef download_video(url, dest_path):\n    \"\"\"Stream-download the video file to dest_path.\"\"\"\n    print(f\"[info] Downloading video → {dest_path}\")\n    req = urllib.request.Request(url, headers={\"User-Agent\": \"wiseflow-video-gen/1.0\"})\n    with urllib.request.urlopen(req, timeout=300) as resp:\n        dest_path.write_bytes(resp.read())\n\n\ndef main():\n    parser = argparse.ArgumentParser(description=\"SiliconFlow video generation\")\n    parser.add_argument(\"--prompt\", required=True, help=\"Video description\")\n    parser.add_argument(\n        \"--model\",\n        default=T2V_MODEL,\n        choices=[T2V_MODEL, I2V_MODEL],\n        help=\"Model ID\",\n    )\n    parser.add_argument(\n        \"--image\",\n        default=None,\n        help=\"Image URL or base64 data URI (required for I2V model)\",\n    )\n    parser.add_argument(\n        \"--image-size\",\n        default=\"1280x720\",\n        choices=sorted(VALID_SIZES),\n        dest=\"image_size\",\n        help=\"Video resolution\",\n    )\n    parser.add_argument(\"--negative-prompt\", default=None, dest=\"negative_prompt\")\n    parser.add_argument(\"--seed\", type=int, default=None)\n    parser.add_argument(\"--poll-interval\", type=int, default=10, dest=\"poll_interval\")\n    parser.add_argument(\"--timeout\", type=int, default=600)\n    parser.add_argument(\"--out-dir\", default=None, dest=\"out_dir\")\n    args = parser.parse_args()\n\n    if args.model == I2V_MODEL and not args.image:\n        print(f\"[error] --image is required when using model '{I2V_MODEL}'\", 
file=sys.stderr)\n        sys.exit(1)\n\n    api_key = os.environ.get(\"SILICONFLOW_API_KEY\")\n    if not api_key:\n        print(\"[error] SILICONFLOW_API_KEY not set\", file=sys.stderr)\n        sys.exit(1)\n\n    ts = int(time.time())\n    out_dir = Path(args.out_dir) if args.out_dir else Path(f\"./tmp/sf-video-{ts}\")\n    out_dir.mkdir(parents=True, exist_ok=True)\n\n    payload = {\n        \"model\": args.model,\n        \"prompt\": args.prompt,\n        \"image_size\": args.image_size,\n    }\n    if args.image:\n        payload[\"image\"] = args.image\n    if args.negative_prompt:\n        payload[\"negative_prompt\"] = args.negative_prompt\n    if args.seed is not None:\n        payload[\"seed\"] = args.seed\n\n    print(f\"[info] Submitting job: model={args.model} size={args.image_size}\")\n    request_id = submit_job(payload, api_key)\n    print(f\"[info] Job submitted. requestId={request_id}\")\n    print(f\"[info] Polling every {args.poll_interval}s (timeout={args.timeout}s)…\")\n\n    result = poll_until_done(request_id, api_key, args.poll_interval, args.timeout)\n\n    videos = result.get(\"results\", {}).get(\"videos\", [])\n    if not videos:\n        print(f\"[error] No videos in result: {result}\", file=sys.stderr)\n        sys.exit(1)\n\n    video_url = videos[0].get(\"url\", \"\")\n    if not video_url:\n        print(\"[error] Empty video URL in result\", file=sys.stderr)\n        sys.exit(1)\n\n    video_path = out_dir / f\"video_{request_id[:8]}.mp4\"\n    download_video(video_url, video_path)\n\n    result_path = out_dir / \"result.json\"\n    result_path.write_text(json.dumps(result, ensure_ascii=False, indent=2))\n\n    print(f\"[done] Video saved to: {video_path}\")\n    print(f\"[done] Metadata: {result_path}\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "wiseflow/crew/new-media-editor/skills/wenyan-formatter/SKILL.md",
    "content": "---\nname: wenyan-formatter\ndescription: Format Markdown drafts into styled HTML for preview, or publish directly to WeChat Official Account (GZH) draft box. Supports multiple visual themes with smart topic-based theme selection.\nhomepage: https://github.com/theajack/wenyan\nmetadata:\n  {\n    \"openclaw\":\n      {\n        \"emoji\": \"📝\",\n        \"requires\": { \"bins\": [\"node\", \"npx\"] },\n        \"optionalEnv\": [\"WECHAT_APP_ID\", \"WECHAT_APP_SECRET\"],\n      },\n  }\n---\n\n# Wenyan Formatter\n\n将 Markdown 草稿渲染为微信公众号风格的 HTML，或一键推送到公众号草稿箱。\n\n## Run\n\n### 预览（render）\n\n```bash\n# 渲染本地 Markdown 文件\nbash {baseDir}/scripts/format.sh --file ./draft.md\n\n# 渲染内联 Markdown，指定主题\nbash {baseDir}/scripts/format.sh --content \"## 标题\\n\\n正文内容...\" --theme grace\n\n# 指定代码高亮主题\nbash {baseDir}/scripts/format.sh --file ./draft.md --theme tech --highlight github-dark\n\n# 自定义 CSS 主题\nbash {baseDir}/scripts/format.sh --file ./draft.md --custom-theme ./my-theme.css\n```\n\n### 发布到公众号草稿箱（publish）\n\n> 需要预先设置 `WECHAT_APP_ID` 和 `WECHAT_APP_SECRET`。\n\n```bash\n# 本机 IP 已在公众号 IP 白名单时，直接发布\nbash {baseDir}/scripts/format.sh --action publish --file ./draft.md --theme grace\n\n# 通过 Wenyan Server 绕过 IP 白名单限制（推荐）\nbash {baseDir}/scripts/format.sh \\\n  --action publish \\\n  --file ./draft.md \\\n  --theme grace \\\n  --server https://your-wenyan-server.example.com \\\n  --api-key YOUR_SERVER_KEY\n```\n\n## Parameters\n\n| Flag | Default | Description |\n|------|---------|-------------|\n| `--action` | `render` | `render`（仅输出 HTML）或 `publish`（推送草稿） |\n| `--file` | — | 本地 Markdown 文件路径（与 `--content` 二选一） |\n| `--content` | — | 直接传入 Markdown 字符串 |\n| `--theme` | `default` | 视觉主题 ID（见下方主题列表） |\n| `--highlight` | `solarized-light` | 代码块高亮主题 |\n| `--custom-theme` | — | 自定义 CSS 文件路径 |\n| `--no-mac-style` | — | 关闭 Mac 风格代码块头部 |\n| `--no-footnote` | — | 禁用链接转脚注 |\n| `--out-dir` | `./tmp/wenyan-<ts>` | 渲染输出���录（仅 render 模式） |\n| `--server` | — | Wenyan Server 
URL（publish 模式，绕过 IP 白名单） |\n| `--api-key` | — | Wenyan Server API Key（配合 `--server` 使用） |\n\n## Output\n\n**render 模式**\n\n- `<out-dir>/output.html` — 可在浏览器预览的完整 HTML 页面\n- `<out-dir>/source.md` — 原始 Markdown 备份\n\n**publish 模式**\n\n- 控制台打印推送结果（草稿 media_id 或错误信息）\n\n## Theme List & 智能主题匹配\n\n> Agent 应根据文章内容自动选择最贴切的主题，避免每次都使用默认主题。\n\n| 主题 ID | 视觉风格 | 适用内容场景 |\n|---------|---------|------------|\n| `default` | 简洁黑白，标准微信排版 | 通知、简讯、时效性资讯 |\n| `grace` | 优雅衬线，米白底色 | 深度长文、读书笔记、文化艺术 |\n| `tech` | 深色背景，等宽字体强调 | 技术教程、代码分析、产品评测 |\n| `fresh` | 清新绿色调，轻量感 | 健康生活、美食、户外、轻量资讯 |\n| `warm` | 暖橙/土黄，温暖感 | 情感、故事、节日、生活感悟 |\n| `elegant` | 极简留白，高对比标题 | 品牌内容、商务、高端产品推介 |\n| `cute` | 粉色圆角，活泼插画感 | 亲子、宠物、娱乐、年轻用户 |\n\n### 智能主题匹配决策树\n\n```\n内容含大量代码 / 技术术语？\n  └─ 是 → tech\n\n主要受众是年轻女性 / 亲子 / 萌宠？\n  └─ 是 → cute\n\n情感/故事/节日/生活类内容？\n  └─ 是 → warm\n\n健康 / 美食 / 户外 / 轻量生活？\n  └─ 是 → fresh\n\n品牌推广 / 商务 / 精品内容？\n  └─ 是 → elegant\n\n深度长文 / 文化 / 书评？\n  └─ 是 → grace\n\n以上均不符合（资讯/通知/简讯）\n  └─ → default\n```\n\n> 用户明确指定主题时，遵从用户选择；不确定时按决策树匹配，并在消息中告知已选主题及原因。\n\n## Environment Variables\n\n| 变量 | 说明 |\n|------|------|\n| `WECHAT_APP_ID` | 微信公众号 AppID（publish 模式必填） |\n| `WECHAT_APP_SECRET` | 微信公众号 AppSecret（publish 模式必填） |\n\n## Notes\n\n- 第一次运行 `npx @wenyan-md/cli` 时会自动下载包，耗时约 10–30 秒，之后有缓存\n- **publish 模式**默认要求本机 IP 在公众号后台 IP 白名单中；生产环境建议配置 Wenyan Server 绕过限制\n- Markdown 文件应包含 YAML frontmatter（至少含 `title:` 字段），否则 publish 时可能被微信 API 拒绝\n- 渲染 HTML 仅供预览，最终排版以公众号后台实际效果为准\n"
  },
  {
    "path": "wiseflow/crew/new-media-editor/skills/wenyan-formatter/scripts/format.sh",
    "content": "#!/usr/bin/env bash\n# wenyan-formatter — Markdown → styled HTML (render) or WeChat GZH draft (publish)\n# Wraps @wenyan-md/cli via npx; requires Node.js 18+.\n#\n# Usage:\n#   bash format.sh [options]\n#\n# Actions:\n#   render  (default) — convert Markdown to styled HTML, save to --out-dir\n#   publish           — render + upload images + push to WeChat GZH draft box\n#\n# Input (one required):\n#   --file <path>      local Markdown file path\n#   --content <text>   inline Markdown string (will be piped via stdin)\n#\n# Options:\n#   --action <render|publish>   default: render\n#   --theme <id>                theme ID (see SKILL.md for list); default: default\n#   --highlight <id>            code highlight theme; default: solarized-light\n#   --custom-theme <path>       path to custom CSS file\n#   --no-mac-style              disable Mac-style code block header\n#   --no-footnote               disable link→footnote conversion\n#   --out-dir <path>            output directory (render only); default: ./tmp/wenyan-<ts>\n#   --server <url>              Wenyan Server URL (publish only, bypasses IP whitelist)\n#   --api-key <key>             API key for Wenyan Server (publish only)\n#\n# Environment (publish only):\n#   WECHAT_APP_ID      WeChat GZH app ID\n#   WECHAT_APP_SECRET  WeChat GZH app secret\nset -euo pipefail\n\n# ---------- defaults ----------\nACTION=\"render\"\nFILE=\"\"\nCONTENT=\"\"\nTHEME=\"default\"\nHIGHLIGHT=\"solarized-light\"\nCUSTOM_THEME=\"\"\nOUT_DIR=\"\"\nSERVER=\"\"\nAPI_KEY=\"\"\nMAC_STYLE_FLAG=\"--mac-style\"\nFOOTNOTE_FLAG=\"--footnote\"\n\n# ---------- argument parsing ----------\nwhile [[ $# -gt 0 ]]; do\n    case \"$1\" in\n        --action)       ACTION=\"$2\";       shift 2 ;;\n        --file)         FILE=\"$2\";         shift 2 ;;\n        --content)      CONTENT=\"$2\";      shift 2 ;;\n        --theme)        THEME=\"$2\";        shift 2 ;;\n        --highlight)    HIGHLIGHT=\"$2\";    shift 2 ;;\n        
--custom-theme) CUSTOM_THEME=\"$2\"; shift 2 ;;\n        --out-dir)      OUT_DIR=\"$2\";      shift 2 ;;\n        --server)       SERVER=\"$2\";       shift 2 ;;\n        --api-key)      API_KEY=\"$2\";      shift 2 ;;\n        --no-mac-style) MAC_STYLE_FLAG=\"--no-mac-style\"; shift ;;\n        --no-footnote)  FOOTNOTE_FLAG=\"--no-footnote\";   shift ;;\n        *)\n            echo \"[error] Unknown argument: $1\" >&2\n            exit 1\n            ;;\n    esac\ndone\n\n# ---------- validate action ----------\nif [[ \"$ACTION\" != \"render\" && \"$ACTION\" != \"publish\" ]]; then\n    echo \"[error] --action must be 'render' or 'publish', got: $ACTION\" >&2\n    exit 1\nfi\n\n# ---------- validate input ----------\nif [[ -z \"$FILE\" && -z \"$CONTENT\" ]]; then\n    echo \"[error] Provide either --file <path> or --content <markdown>\" >&2\n    exit 1\nfi\n\nif [[ -n \"$FILE\" && ! -f \"$FILE\" ]]; then\n    echo \"[error] File not found: $FILE\" >&2\n    exit 1\nfi\n\n# ---------- check node/npx ----------\nif ! command -v node &>/dev/null; then\n    echo \"[error] node not found. Install Node.js 18+ first.\" >&2\n    exit 1\nfi\n\nNODE_VER=$(node -e \"process.stdout.write(process.version)\")\nNODE_MAJOR=$(echo \"$NODE_VER\" | sed 's/v//' | cut -d. -f1)\nif [[ \"$NODE_MAJOR\" -lt 18 ]]; then\n    echo \"[error] Node.js 18+ required, found $NODE_VER\" >&2\n    exit 1\nfi\n\nif ! command -v npx &>/dev/null; then\n    echo \"[error] npx not found. 
It should ship with Node.js.\" >&2\n    exit 1\nfi\n\n# ---------- publish: pre-flight checks ----------\nif [[ \"$ACTION\" == \"publish\" ]]; then\n    if [[ -z \"${WECHAT_APP_ID:-}\" || -z \"${WECHAT_APP_SECRET:-}\" ]]; then\n        echo \"[error] publish requires WECHAT_APP_ID and WECHAT_APP_SECRET to be set.\" >&2\n        echo \"[hint]  Set them via environment variables before calling this script.\" >&2\n        exit 1\n    fi\n\n    # frontmatter title check (only when using --content, file is user's responsibility)\n    if [[ -n \"$CONTENT\" ]]; then\n        if ! echo \"$CONTENT\" | grep -qE '^---' || ! echo \"$CONTENT\" | grep -qE '^title:'; then\n            echo \"[warn] publish mode: --content should include frontmatter with 'title:' field.\" >&2\n            echo \"[warn] Missing title may cause WeChat API rejection.\" >&2\n        fi\n    fi\nfi\n\n# ---------- build npx args ----------\nWENYAN_ARGS=(\"--yes\" \"@wenyan-md/cli\" \"$ACTION\")\n\nif [[ -n \"$FILE\" ]]; then\n    WENYAN_ARGS+=(\"--file\" \"$FILE\")\nfi\n\nWENYAN_ARGS+=(\"--theme\" \"$THEME\")\nWENYAN_ARGS+=(\"--highlight\" \"$HIGHLIGHT\")\nWENYAN_ARGS+=(\"$MAC_STYLE_FLAG\")\nWENYAN_ARGS+=(\"$FOOTNOTE_FLAG\")\n\nif [[ -n \"$CUSTOM_THEME\" ]]; then\n    WENYAN_ARGS+=(\"--custom-theme\" \"$CUSTOM_THEME\")\nfi\n\nif [[ \"$ACTION\" == \"publish\" ]]; then\n    if [[ -n \"$SERVER\" ]]; then\n        WENYAN_ARGS+=(\"--server\" \"$SERVER\")\n        if [[ -n \"$API_KEY\" ]]; then\n            WENYAN_ARGS+=(\"--api-key\" \"$API_KEY\")\n        fi\n    fi\nfi\n\n# ---------- render: prepare output directory ----------\nif [[ \"$ACTION\" == \"render\" ]]; then\n    TS=$(date +%s)\n    if [[ -z \"$OUT_DIR\" ]]; then\n        OUT_DIR=\"./tmp/wenyan-${TS}\"\n    fi\n    mkdir -p \"$OUT_DIR\"\n    HTML_FILE=\"${OUT_DIR}/output.html\"\n\n    echo \"[info] Action: render | Theme: ${THEME} | Highlight: ${HIGHLIGHT}\"\n    echo \"[info] Output dir: ${OUT_DIR}\"\n\n    # Run wenyan render → capture HTML 
output\n    if [[ -n \"$CONTENT\" ]]; then\n        HTML_CONTENT=$(echo \"$CONTENT\" | npx \"${WENYAN_ARGS[@]}\")\n    else\n        HTML_CONTENT=$(npx \"${WENYAN_ARGS[@]}\")\n    fi\n\n    # Wrap in minimal page for browser preview\n    cat > \"$HTML_FILE\" <<HTML_EOF\n<!DOCTYPE html>\n<html lang=\"zh\">\n<head>\n<meta charset=\"UTF-8\">\n<meta name=\"viewport\" content=\"width=device-width, initial-scale=1.0\">\n<title>wenyan preview</title>\n<style>\n  body { max-width: 780px; margin: 40px auto; padding: 0 20px; font-family: sans-serif; }\n</style>\n</head>\n<body>\n${HTML_CONTENT}\n</body>\n</html>\nHTML_EOF\n\n    # Also save original Markdown for reference\n    if [[ -n \"$FILE\" ]]; then\n        cp \"$FILE\" \"${OUT_DIR}/source.md\"\n        echo \"[done] Source: ${OUT_DIR}/source.md\"\n    else\n        echo \"$CONTENT\" > \"${OUT_DIR}/source.md\"\n        echo \"[done] Source: ${OUT_DIR}/source.md\"\n    fi\n\n    echo \"[done] HTML:   ${HTML_FILE}\"\n    echo \"[done] Theme used: ${THEME}\"\n\n# ---------- publish ----------\nelse\n    echo \"[info] Action: publish | Theme: ${THEME}\"\n    if [[ -n \"$SERVER\" ]]; then\n        echo \"[info] Using server mode: ${SERVER}\"\n    else\n        echo \"[info] Using local mode (IP must be in WeChat whitelist)\"\n    fi\n\n    if [[ -n \"$CONTENT\" ]]; then\n        RESULT=$(echo \"$CONTENT\" | npx \"${WENYAN_ARGS[@]}\")\n    else\n        RESULT=$(npx \"${WENYAN_ARGS[@]}\")\n    fi\n\n    echo \"[done] $RESULT\"\nfi\n"
  },
  {
    "path": "wiseflow/overrides.sh",
    "content": "#!/bin/bash\n# wiseflow addon - overrides.sh\n# 通过 pnpm overrides 将 playwright-core 替换为 patchright-core（反检测）\n# 由 apply-addons.sh 调用，接收环境变量：ADDON_DIR, OPENCLAW_DIR\nset -e\n\nPATCHRIGHT_VERSION=\"${PATCHRIGHT_VERSION:-1.58.2}\"\n\n# ─── pnpm overrides（核心，不修改源码） ─────────────────────────\necho \"    → pnpm override: playwright-core → patchright-core@${PATCHRIGHT_VERSION}\"\n\ncd \"$OPENCLAW_DIR\"\nnode -e \"\nconst fs = require('fs');\nconst pkg = JSON.parse(fs.readFileSync('package.json', 'utf8'));\npkg.pnpm = pkg.pnpm || {};\npkg.pnpm.overrides = pkg.pnpm.overrides || {};\npkg.pnpm.overrides['playwright-core'] = 'npm:patchright-core@${PATCHRIGHT_VERSION}';\nfs.writeFileSync('package.json', JSON.stringify(pkg, null, 2) + '\\n');\n\"\n\n# ─── 文档文本替换（可选，仅影响文档准确性） ────────────────────\nDOC_FILES=(\n  \"Dockerfile\"\n  \"docs/help/faq.md\"\n  \"docs/install/docker.md\"\n  \"docs/tools/browser.md\"\n  \"docs/zh-CN/install/docker.md\"\n  \"docs/zh-CN/tools/browser.md\"\n  \"src/browser/pw-tools-core.snapshot.ts\"\n  \"src/browser/routes/agent.shared.ts\"\n  \"src/dockerfile.test.ts\"\n)\n\nfor file in \"${DOC_FILES[@]}\"; do\n  if [ -f \"$file\" ]; then\n    if [[ \"$OSTYPE\" == \"darwin\"* ]]; then\n      sed -i '' 's/playwright-core/patchright-core/g' \"$file\"\n    else\n      sed -i 's/playwright-core/patchright-core/g' \"$file\"\n    fi\n  fi\ndone\n\n# ─── 禁用内置 web_search 工具（由 smart-search skill 通过浏览器替代） ──────────\n# openclaw 加载顺序：CWD/.env → OPENCLAW_STATE_DIR/.env（不覆盖已有值）\n# 这里写入 OPENCLAW_STATE_DIR/.env（默认 ~/.openclaw/.env）\nOPENCLAW_STATE_DIR=\"${OPENCLAW_STATE_DIR:-$HOME/.openclaw}\"\nENV_FILE=\"$OPENCLAW_STATE_DIR/.env\"\nmkdir -p \"$OPENCLAW_STATE_DIR\"\nif ! grep -q \"OPENCLAW_DISABLE_WEB_SEARCH\" \"$ENV_FILE\" 2>/dev/null; then\n  echo \"OPENCLAW_DISABLE_WEB_SEARCH=1\" >> \"$ENV_FILE\"\n  echo \"    → injected OPENCLAW_DISABLE_WEB_SEARCH=1 into $ENV_FILE\"\nelse\n  echo \"    → OPENCLAW_DISABLE_WEB_SEARCH already set in $ENV_FILE\"\nfi\n"
  },
  {
    "path": "wiseflow/patches/001-browser-tab-recovery.patch",
    "content": "diff --git a/src/agents/tools/browser-tool.actions.ts b/src/agents/tools/browser-tool.actions.ts\nindex a4b6cb456..8fd27500c 100644\n--- a/src/agents/tools/browser-tool.actions.ts\n+++ b/src/agents/tools/browser-tool.actions.ts\n@@ -81,27 +81,35 @@ function isChromeStaleTargetError(profile: string | undefined, err: unknown): bo\n   return msg.includes(\"404:\") && msg.includes(\"tab not found\");\n }\n \n-function stripTargetIdFromActRequest(\n-  request: Parameters<typeof browserAct>[1],\n-): Parameters<typeof browserAct>[1] | null {\n-  const targetId = typeof request.targetId === \"string\" ? request.targetId.trim() : undefined;\n-  if (!targetId) {\n-    return null;\n-  }\n-  const retryRequest = { ...request };\n-  delete retryRequest.targetId;\n-  return retryRequest as Parameters<typeof browserAct>[1];\n+function isTabResolutionError(err: unknown): boolean {\n+  const msg = String(err).toLowerCase();\n+  return msg.includes(\"tab not found\") || msg.includes(\"ambiguous target id prefix\");\n }\n \n-function canRetryChromeActWithoutTargetId(request: Parameters<typeof browserAct>[1]): boolean {\n-  const typedRequest = request as Partial<Record<\"kind\" | \"action\", unknown>>;\n-  const kind =\n-    typeof typedRequest.kind === \"string\"\n-      ? typedRequest.kind\n-      : typeof typedRequest.action === \"string\"\n-        ? typedRequest.action\n-        : \"\";\n-  return kind === \"hover\" || kind === \"scrollIntoView\" || kind === \"wait\";\n+function pickFallbackTargetId(tabs: unknown[]): string | undefined {\n+  const list = tabs.filter((tab): tab is { targetId?: unknown; type?: unknown } => {\n+    return Boolean(tab && typeof tab === \"object\");\n+  });\n+  const pages = list.filter((tab) => {\n+    const type = typeof tab.type === \"string\" ? tab.type : \"page\";\n+    return type === \"page\";\n+  });\n+  const primary = pages[0] ?? 
list[0];\n+  const raw = primary?.targetId;\n+  return typeof raw === \"string\" && raw.trim() ? raw.trim() : undefined;\n+}\n+\n+async function listTabsForRecovery(params: {\n+  baseUrl?: string;\n+  profile?: string;\n+  proxyRequest: BrowserProxyRequest | null;\n+}): Promise<unknown[]> {\n+  const { baseUrl, profile, proxyRequest } = params;\n+  if (proxyRequest) {\n+    const result = await proxyRequest({ method: \"GET\", path: \"/tabs\", profile });\n+    return ((result as { tabs?: unknown[] }).tabs ?? []).filter(Boolean);\n+  }\n+  return await browserTabs(baseUrl, { profile }).catch(() => []);\n }\n \n export async function executeTabsAction(params: {\n@@ -182,17 +190,37 @@ export async function executeSnapshotAction(params: {\n     labels,\n     mode,\n   };\n-  const snapshot = proxyRequest\n-    ? ((await proxyRequest({\n-        method: \"GET\",\n-        path: \"/snapshot\",\n-        profile,\n-        query: snapshotQuery,\n-      })) as Awaited<ReturnType<typeof browserSnapshot>>)\n-    : await browserSnapshot(baseUrl, {\n-        ...snapshotQuery,\n-        profile,\n-      });\n+  const runSnapshot = async (retryTargetId: string | undefined) => {\n+    return proxyRequest\n+      ? 
((await proxyRequest({\n+          method: \"GET\",\n+          path: \"/snapshot\",\n+          profile,\n+          query: { ...snapshotQuery, targetId: retryTargetId },\n+        })) as Awaited<ReturnType<typeof browserSnapshot>>)\n+      : await browserSnapshot(baseUrl, {\n+          ...snapshotQuery,\n+          targetId: retryTargetId,\n+          profile,\n+        });\n+  };\n+  let snapshot: Awaited<ReturnType<typeof browserSnapshot>>;\n+  try {\n+    snapshot = await runSnapshot(targetId);\n+  } catch (err) {\n+    if (!isTabResolutionError(err)) {\n+      throw err;\n+    }\n+    const tabs = await listTabsForRecovery({ baseUrl, profile, proxyRequest });\n+    const recoveredTargetId = pickFallbackTargetId(tabs);\n+    if (!recoveredTargetId) {\n+      throw new Error(\n+        `Browser tab was lost and no recoverable tab is available. Run action=tabs${profile ? ` profile=\"${profile}\"` : \"\"} to refresh tab state.`,\n+        { cause: err },\n+      );\n+    }\n+    snapshot = await runSnapshot(recoveredTargetId);\n+  }\n   if (snapshot.format === \"ai\") {\n     const extractedText = snapshot.snapshot ?? \"\";\n     const wrappedSnapshot = wrapExternalContent(extractedText, {\n@@ -303,46 +331,37 @@ export async function executeActAction(params: {\n         });\n     return jsonResult(result);\n   } catch (err) {\n-    if (isChromeStaleTargetError(profile, err)) {\n-      const retryRequest = stripTargetIdFromActRequest(request);\n-      const tabs = proxyRequest\n-        ? 
((\n-            (await proxyRequest({\n-              method: \"GET\",\n-              path: \"/tabs\",\n+    if (isTabResolutionError(err)) {\n+      const tabs = await listTabsForRecovery({ baseUrl, profile, proxyRequest });\n+      const fallbackTargetId = pickFallbackTargetId(tabs);\n+      if (fallbackTargetId) {\n+        // Retry with the specific fallback tab targetId.\n+        const recoveredRequest = { ...request, targetId: fallbackTargetId };\n+        const recovered = proxyRequest\n+          ? await proxyRequest({\n+              method: \"POST\",\n+              path: \"/act\",\n               profile,\n-            })) as { tabs?: unknown[] }\n-          ).tabs ?? [])\n-        : await browserTabs(baseUrl, { profile }).catch(() => []);\n-      // Some Chrome relay targetIds can go stale between snapshots and actions.\n-      // Only retry safe read-only actions, and only when exactly one tab remains attached.\n-      if (retryRequest && canRetryChromeActWithoutTargetId(request) && tabs.length === 1) {\n-        try {\n-          const retryResult = proxyRequest\n-            ? 
await proxyRequest({\n-                method: \"POST\",\n-                path: \"/act\",\n-                profile,\n-                body: retryRequest,\n-              })\n-            : await browserAct(baseUrl, retryRequest, {\n-                profile,\n-              });\n-          return jsonResult(retryResult);\n-        } catch {\n-          // Fall through to explicit stale-target guidance.\n-        }\n+              body: recoveredRequest,\n+            })\n+          : await browserAct(baseUrl, recoveredRequest as Parameters<typeof browserAct>[1], {\n+              profile,\n+            });\n+        return jsonResult(recovered);\n       }\n-      if (!tabs.length) {\n+      // No specific tab found — give Chrome-specific guidance if applicable.\n+      if (isChromeStaleTargetError(profile, err)) {\n+        if (!tabs.length) {\n+          throw new Error(\n+            \"No Chrome tabs are attached via the OpenClaw Browser Relay extension. Click the toolbar icon on the tab you want to control (badge ON), then retry.\",\n+            { cause: err },\n+          );\n+        }\n         throw new Error(\n-          \"No Chrome tabs are attached via the OpenClaw Browser Relay extension. Click the toolbar icon on the tab you want to control (badge ON), then retry.\",\n+          `Chrome tab not found (stale targetId?). Run action=tabs profile=\"chrome-relay\" and use one of the returned targetIds.`,\n           { cause: err },\n         );\n       }\n-      throw new Error(\n-        `Chrome tab not found (stale targetId?). 
Run action=tabs profile=\"chrome-relay\" and use one of the returned targetIds.`,\n-        { cause: err },\n-      );\n     }\n     throw err;\n   }\ndiff --git a/src/agents/tools/browser-tool.test.ts b/src/agents/tools/browser-tool.test.ts\nindex adaaea782..9cd664527 100644\n--- a/src/agents/tools/browser-tool.test.ts\n+++ b/src/agents/tools/browser-tool.test.ts\n@@ -287,6 +287,48 @@ describe(\"browser tool snapshot maxChars\", () => {\n     expect(opts?.mode).toBeUndefined();\n   });\n \n+  it(\"recovers snapshot when target tab disappears\", async () => {\n+    browserClientMocks.browserSnapshot\n+      .mockRejectedValueOnce(new Error(\"404: tab not found\"))\n+      .mockResolvedValueOnce({\n+        ok: true,\n+        format: \"ai\",\n+        targetId: \"new-tab\",\n+        url: \"https://example.com\",\n+        snapshot: \"recovered\",\n+      });\n+    browserClientMocks.browserTabs.mockResolvedValueOnce([\n+      {\n+        targetId: \"new-tab\",\n+        type: \"page\",\n+        title: \"Recovered\",\n+        url: \"https://example.com\",\n+      },\n+    ]);\n+\n+    const tool = createBrowserTool();\n+    const result = await tool.execute?.(\"call-1\", {\n+      action: \"snapshot\",\n+      snapshotFormat: \"ai\",\n+      targetId: \"stale-tab\",\n+    });\n+\n+    expect(browserClientMocks.browserSnapshot).toHaveBeenNthCalledWith(\n+      1,\n+      undefined,\n+      expect.objectContaining({ targetId: \"stale-tab\" }),\n+    );\n+    expect(browserClientMocks.browserSnapshot).toHaveBeenNthCalledWith(\n+      2,\n+      undefined,\n+      expect.objectContaining({ targetId: \"new-tab\" }),\n+    );\n+    expect(result?.details).toMatchObject({\n+      targetId: \"new-tab\",\n+      ok: true,\n+    });\n+  });\n+\n   it(\"defaults to host when using profile=chrome-relay (even in sandboxed sessions)\", async () => {\n     setResolvedBrowserProfiles({\n       \"chrome-relay\": {\n@@ -745,16 +787,23 @@ describe(\"browser tool external content 
wrapping\", () => {\n describe(\"browser tool act stale target recovery\", () => {\n   registerBrowserToolAfterEachReset();\n \n-  it(\"retries safe chrome-relay act once without targetId when exactly one tab remains\", async () => {\n+  it(\"recovers act by retrying with fallback tab targetId when tab is stale\", async () => {\n     browserActionsMocks.browserAct\n       .mockRejectedValueOnce(new Error(\"404: tab not found\"))\n-      .mockResolvedValueOnce({ ok: true });\n-    browserClientMocks.browserTabs.mockResolvedValueOnce([{ targetId: \"only-tab\" }]);\n+      .mockResolvedValueOnce({ ok: true, targetId: \"recovered-tab\" });\n+    browserClientMocks.browserTabs.mockResolvedValueOnce([\n+      {\n+        targetId: \"recovered-tab\",\n+        type: \"page\",\n+        title: \"Recovered\",\n+        url: \"https://example.com\",\n+      },\n+    ]);\n \n     const tool = createBrowserTool();\n     const result = await tool.execute?.(\"call-1\", {\n       action: \"act\",\n-      profile: \"chrome-relay\",\n+      profile: \"chrome\",\n       request: {\n         kind: \"hover\",\n         targetId: \"stale-tab\",\n@@ -767,34 +816,48 @@ describe(\"browser tool act stale target recovery\", () => {\n       1,\n       undefined,\n       expect.objectContaining({ targetId: \"stale-tab\", kind: \"hover\", ref: \"btn-1\" }),\n-      expect.objectContaining({ profile: \"chrome-relay\" }),\n+      expect.objectContaining({ profile: \"chrome\" }),\n     );\n     expect(browserActionsMocks.browserAct).toHaveBeenNthCalledWith(\n       2,\n       undefined,\n-      expect.not.objectContaining({ targetId: expect.anything() }),\n-      expect.objectContaining({ profile: \"chrome-relay\" }),\n+      expect.objectContaining({ targetId: \"recovered-tab\" }),\n+      expect.objectContaining({ profile: \"chrome\" }),\n     );\n     expect(result?.details).toMatchObject({ ok: true });\n   });\n \n-  it(\"does not retry mutating chrome-relay act requests without targetId\", 
async () => {\n-    browserActionsMocks.browserAct.mockRejectedValueOnce(new Error(\"404: tab not found\"));\n-    browserClientMocks.browserTabs.mockResolvedValueOnce([{ targetId: \"only-tab\" }]);\n+  it(\"recovers act when target tab disappears (non-chrome profile)\", async () => {\n+    browserActionsMocks.browserAct\n+      .mockRejectedValueOnce(new Error(\"404: tab not found\"))\n+      .mockResolvedValueOnce({ ok: true, targetId: \"new-tab\" });\n+    browserClientMocks.browserTabs.mockResolvedValueOnce([\n+      {\n+        targetId: \"new-tab\",\n+        type: \"page\",\n+        title: \"Recovered\",\n+        url: \"https://example.com\",\n+      },\n+    ]);\n \n     const tool = createBrowserTool();\n-    await expect(\n-      tool.execute?.(\"call-1\", {\n-        action: \"act\",\n-        profile: \"chrome-relay\",\n-        request: {\n-          kind: \"click\",\n-          targetId: \"stale-tab\",\n-          ref: \"btn-1\",\n-        },\n-      }),\n-    ).rejects.toThrow(/Run action=tabs profile=\"chrome-relay\"/i);\n+    const result = await tool.execute?.(\"call-1\", {\n+      action: \"act\",\n+      request: { kind: \"click\", ref: \"e1\", targetId: \"stale-tab\" },\n+    });\n \n-    expect(browserActionsMocks.browserAct).toHaveBeenCalledTimes(1);\n+    expect(browserActionsMocks.browserAct).toHaveBeenNthCalledWith(\n+      1,\n+      undefined,\n+      expect.objectContaining({ targetId: \"stale-tab\" }),\n+      expect.objectContaining({ profile: undefined }),\n+    );\n+    expect(browserActionsMocks.browserAct).toHaveBeenNthCalledWith(\n+      2,\n+      undefined,\n+      expect.objectContaining({ targetId: \"new-tab\" }),\n+      expect.objectContaining({ profile: undefined }),\n+    );\n+    expect(result?.details).toMatchObject({ ok: true, targetId: \"new-tab\" });\n   });\n });\n"
  },
  {
    "path": "wiseflow/patches/002-disable-web-search-env-var.patch",
    "content": "diff --git a/src/agents/tools/web-search.ts b/src/agents/tools/web-search.ts\nindex 1e4983f85..aa7dac794 100644\n--- a/src/agents/tools/web-search.ts\n+++ b/src/agents/tools/web-search.ts\n@@ -420,6 +420,10 @@ function resolveSearchConfig(cfg?: OpenClawConfig): WebSearchConfig {\n }\n \n function resolveSearchEnabled(params: { search?: WebSearchConfig; sandboxed?: boolean }): boolean {\n+  // Allow disabling via env var (e.g., set by wiseflow addon to route searches through the browser)\n+  if (process.env.OPENCLAW_DISABLE_WEB_SEARCH === \"1\") {\n+    return false;\n+  }\n   if (typeof params.search?.enabled === \"boolean\") {\n     return params.search.enabled;\n   }\n"
  },
  {
    "path": "wiseflow/patches/003-act-field-validation.patch",
    "content": "diff --git a/src/agents/tools/browser-tool.actions.ts b/src/agents/tools/browser-tool.actions.ts\nindex 8fd27500c..af0f67d3d 100644\n--- a/src/agents/tools/browser-tool.actions.ts\n+++ b/src/agents/tools/browser-tool.actions.ts\n@@ -73,6 +73,41 @@ function formatConsoleToolResult(result: {\n   };\n }\n \n+// Kinds that require a ref from the current snapshot before acting.\n+const REF_REQUIRED_KINDS = new Set([\"click\", \"type\", \"hover\", \"scrollIntoView\"]);\n+\n+// Pre-validate act request fields before hitting the browser server so the LLM\n+// receives a clear, actionable error message rather than a raw HTTP 400.\n+function validateActRequest(request: Parameters<typeof browserAct>[1]): void {\n+  const req = request as Record<string, unknown>;\n+  const kind = typeof req.kind === \"string\" ? req.kind : \"\";\n+\n+  if (REF_REQUIRED_KINDS.has(kind)) {\n+    const ref = typeof req.ref === \"string\" ? req.ref.trim() : \"\";\n+    if (!ref) {\n+      throw new Error(\n+        `act kind=\"${kind}\" requires ref. Call action=snapshot first to get current element refs, then retry with a ref from the snapshot result.`,\n+      );\n+    }\n+  }\n+\n+  if (kind === \"press\") {\n+    const key = typeof req.key === \"string\" ? req.key.trim() : \"\";\n+    if (!key) {\n+      throw new Error(`act kind=\"press\" requires key (e.g. \"Enter\", \"Tab\", \"Escape\").`);\n+    }\n+  }\n+\n+  if (kind === \"fill\") {\n+    const fields = Array.isArray(req.fields) ? req.fields : null;\n+    if (!fields || fields.length === 0) {\n+      throw new Error(\n+        `act kind=\"fill\" requires fields (array of {ref, value} objects). 
Call action=snapshot first to get element refs.`,\n+      );\n+    }\n+  }\n+}\n+\n function isChromeStaleTargetError(profile: string | undefined, err: unknown): boolean {\n   if (profile !== \"chrome-relay\" && profile !== \"chrome\") {\n     return false;\n@@ -318,6 +353,7 @@ export async function executeActAction(params: {\n   proxyRequest: BrowserProxyRequest | null;\n }): Promise<AgentToolResult<unknown>> {\n   const { request, baseUrl, profile, proxyRequest } = params;\n+  validateActRequest(request);\n   try {\n     const result = proxyRequest\n       ? await proxyRequest({\ndiff --git a/src/agents/tools/browser-tool.test.ts b/src/agents/tools/browser-tool.test.ts\nindex 9cd664527..84cd5d043 100644\n--- a/src/agents/tools/browser-tool.test.ts\n+++ b/src/agents/tools/browser-tool.test.ts\n@@ -861,3 +861,113 @@ describe(\"browser tool act stale target recovery\", () => {\n     expect(result?.details).toMatchObject({ ok: true, targetId: \"new-tab\" });\n   });\n });\n+\n+describe(\"browser tool act field validation\", () => {\n+  registerBrowserToolAfterEachReset();\n+\n+  it(\"rejects click without ref before calling browserAct\", async () => {\n+    const tool = createBrowserTool();\n+    await expect(\n+      tool.execute?.(\"call-1\", {\n+        action: \"act\",\n+        request: { kind: \"click\", targetId: \"t1\" },\n+      }),\n+    ).rejects.toThrow(/requires ref.*action=snapshot/i);\n+    expect(browserActionsMocks.browserAct).not.toHaveBeenCalled();\n+  });\n+\n+  it(\"rejects type without ref before calling browserAct\", async () => {\n+    const tool = createBrowserTool();\n+    await expect(\n+      tool.execute?.(\"call-1\", {\n+        action: \"act\",\n+        request: { kind: \"type\", text: \"hello\" },\n+      }),\n+    ).rejects.toThrow(/requires ref.*action=snapshot/i);\n+    expect(browserActionsMocks.browserAct).not.toHaveBeenCalled();\n+  });\n+\n+  it(\"rejects hover without ref before calling browserAct\", async () => {\n+    const 
tool = createBrowserTool();\n+    await expect(\n+      tool.execute?.(\"call-1\", {\n+        action: \"act\",\n+        request: { kind: \"hover\" },\n+      }),\n+    ).rejects.toThrow(/requires ref.*action=snapshot/i);\n+    expect(browserActionsMocks.browserAct).not.toHaveBeenCalled();\n+  });\n+\n+  it(\"rejects press without key before calling browserAct\", async () => {\n+    const tool = createBrowserTool();\n+    await expect(\n+      tool.execute?.(\"call-1\", {\n+        action: \"act\",\n+        request: { kind: \"press\" },\n+      }),\n+    ).rejects.toThrow(/requires key/i);\n+    expect(browserActionsMocks.browserAct).not.toHaveBeenCalled();\n+  });\n+\n+  it(\"rejects fill without fields before calling browserAct\", async () => {\n+    const tool = createBrowserTool();\n+    await expect(\n+      tool.execute?.(\"call-1\", {\n+        action: \"act\",\n+        request: { kind: \"fill\" },\n+      }),\n+    ).rejects.toThrow(/requires fields/i);\n+    expect(browserActionsMocks.browserAct).not.toHaveBeenCalled();\n+  });\n+\n+  it(\"rejects fill with empty fields array before calling browserAct\", async () => {\n+    const tool = createBrowserTool();\n+    await expect(\n+      tool.execute?.(\"call-1\", {\n+        action: \"act\",\n+        request: { kind: \"fill\", fields: [] },\n+      }),\n+    ).rejects.toThrow(/requires fields/i);\n+    expect(browserActionsMocks.browserAct).not.toHaveBeenCalled();\n+  });\n+\n+  it(\"passes click with valid ref through to browserAct\", async () => {\n+    browserActionsMocks.browserAct.mockResolvedValueOnce({ ok: true });\n+    const tool = createBrowserTool();\n+    await tool.execute?.(\"call-1\", {\n+      action: \"act\",\n+      request: { kind: \"click\", ref: \"e12\" },\n+    });\n+    expect(browserActionsMocks.browserAct).toHaveBeenCalledTimes(1);\n+  });\n+\n+  it(\"passes press with valid key through to browserAct\", async () => {\n+    browserActionsMocks.browserAct.mockResolvedValueOnce({ 
ok: true });\n+    const tool = createBrowserTool();\n+    await tool.execute?.(\"call-1\", {\n+      action: \"act\",\n+      request: { kind: \"press\", key: \"Enter\" },\n+    });\n+    expect(browserActionsMocks.browserAct).toHaveBeenCalledTimes(1);\n+  });\n+\n+  it(\"passes fill with valid fields through to browserAct\", async () => {\n+    browserActionsMocks.browserAct.mockResolvedValueOnce({ ok: true });\n+    const tool = createBrowserTool();\n+    await tool.execute?.(\"call-1\", {\n+      action: \"act\",\n+      request: { kind: \"fill\", fields: [{ ref: \"e5\", value: \"test\" }] },\n+    });\n+    expect(browserActionsMocks.browserAct).toHaveBeenCalledTimes(1);\n+  });\n+\n+  it(\"passes wait (no ref required) through to browserAct\", async () => {\n+    browserActionsMocks.browserAct.mockResolvedValueOnce({ ok: true });\n+    const tool = createBrowserTool();\n+    await tool.execute?.(\"call-1\", {\n+      action: \"act\",\n+      request: { kind: \"wait\", timeMs: 500 },\n+    });\n+    expect(browserActionsMocks.browserAct).toHaveBeenCalledTimes(1);\n+  });\n+});\n"
  },
  {
    "path": "wiseflow/patches/004-web-fetch-allow-rfc2544.patch",
    "content": "diff --git a/src/agents/tools/web-fetch.ts b/src/agents/tools/web-fetch.ts\nindex f4cc88e2d..cd08b177c 100644\n--- a/src/agents/tools/web-fetch.ts\n+++ b/src/agents/tools/web-fetch.ts\n@@ -533,6 +533,8 @@ async function runWebFetch(params: WebFetchRuntimeParams): Promise<Record<string\n       url: params.url,\n       maxRedirects: params.maxRedirects,\n       timeoutSeconds: params.timeoutSeconds,\n+      // Allow RFC 2544 benchmark IPs (198.18.0.0/15) used by proxy fake-IP DNS (Clash etc.)\n+      policy: { allowRfc2544BenchmarkRange: true },\n       init: {\n         headers: {\n           Accept: \"text/markdown, text/html;q=0.9, */*;q=0.1\",\n"
  },
  {
    "path": "wiseflow/skills/browser-guide/SKILL.md",
    "content": "---\nname: browser-guide\ndescription: Best practices for using the managed browser — handling login walls, CAPTCHAs, lazy-loaded content, paywalls, and tab cleanup.\nmetadata:\n  {\n    \"openclaw\":\n      {\n        \"emoji\": \"🌐\",\n        \"always\": true,\n      },\n  }\n---\n\n# Browser Best Practices\n\nFollow these rules whenever you use the `browser` tool to interact with web pages.\n\n## 1. Login Prompts\n\nWhen a page shows a login wall (sign-in form, \"please log in\" banner, OAuth redirect, etc.):\n\n1. **Try the browser's built-in password manager first**: check whether the login form has auto-filled credentials from saved passwords. If so, use them to complete the login.\n2. If no saved credentials are available, **do NOT make up usernames or passwords, and do NOT attempt to register a new account**.\n3. Send a message to the user: _\"xx 页面需要登录，浏览器中没有预存密码，请在浏览器中完成登录或注册，完成后请通知我。\"_（xx 为页面标题）.\n4. Wait for the user to confirm.\n5. If no response arrives within **5 minutes**, assume the user is unavailable and continue with whatever content is accessible.\n\n## 2. Simple Verification / CAPTCHA\n\nWhen a page shows a one-click verification challenge (e.g., a button labelled \"去验证\", \"Verify\", \"I'm not a robot\", or a simple checkbox):\n\n1. Try clicking the verification button/checkbox directly.\n2. Wait a few seconds for the page to refresh.\n3. Take a snapshot to check whether normal content has loaded.\n4. If the page now shows the expected content, continue your task.\n\n## 3. Complex Verification Fallback\n\nIf the simple click in Step 2 above **fails** — the page still shows a challenge, the challenge is a puzzle/slider/image-selection CAPTCHA, or an error occurs:\n\n1. **Do NOT retry blindly.** Stop attempting automated verification.\n2. Send a message to the user: _\"xx 页面有验证码，我无法解决，请在浏览器中完成，完成后请通知我。\"_（xx 为页面标题）.\n3. Wait for the user to confirm.\n4. 
If no response arrives within **5 minutes**, continue with whatever content is accessible.\n\n## 4. Lazy-Loaded Content\n\nWhen a page uses lazy loading (infinite scroll, \"load more\" sections, content that appears only after scrolling):\n\n1. Before scrolling, assess whether the not-yet-loaded content is **relevant** to the current task.\n2. If relevant, simulate human-like scrolling: scroll down incrementally, pause briefly between scrolls to allow content to load, then take a snapshot to capture the new content.\n3. Repeat until the needed content is visible or no more new content loads.\n4. Do NOT scroll too fast; scroll as a human would. After 7 scrolls, stop scrolling for this turn.\n5. If not relevant, skip scrolling and work with what is already loaded.\n\n## 5. Paywall / Subscription Walls\n\nWhen a page indicates that content is behind a paywall or requires a specific subscription (e.g., \"Subscribe to continue reading\", \"Continue reading with a WSJ subscription\", premium-only banners):\n\n1. Send a message to the user describing the situation: _\"xx 页面需要订阅，请在浏览器中登录有效账号或者完成付费，完成后请通知我。\"_ (where xx is the page title).\n2. Wait for the user to confirm.\n3. If no response arrives within **5 minutes**, continue with whatever content is accessible (summary, headline, or any visible excerpt).\n\n## 6. QR Code Login\n\nSome platforms (e.g., WeChat Official Account backend at mp.weixin.qq.com, Xiaohongshu creator center, X/Twitter) show a QR code on the login page instead of a password form. When this happens:\n\n1. Use `snapshot` to locate the QR code image element on the page.\n2. Take a screenshot scoped to the QR code area and send it to the user as an image with the message: _\"xx 页面需要扫码登录，请用手机扫描截图中的二维码完成登录，完成后请通知我。\"_ (where xx is the platform name, e.g. \"微信公众号后台\", \"小红书创作者中心\", or \"X\").\n3. 
After sending, poll the page every **3 seconds** using `snapshot`: check for signs of successful login such as a URL change away from the login page, disappearance of the QR code element, or appearance of a username, avatar, or dashboard element.\n4. Once a successful login is detected, resume the original task without waiting for the user to reply.\n5. If no scan occurs within **3 minutes**, send the message: _\"扫码超时，我将继续处理当前可访问的内容。\"_ and continue with whatever is accessible.\n"
  },
  {
    "path": "wiseflow/skills/rss-reader/SKILL.md",
    "content": "---\nname: rss-reader\ndescription: Discover the RSS/Atom feed URL for a website, then run the fetch-rss.mjs script to retrieve and parse articles from the feed.\nmetadata:\n  {\n    \"openclaw\":\n      {\n        \"emoji\": \"📡\",\n        \"always\": false,\n      },\n  }\n---\n\n# RSS / Atom Feed Reader\n\nUse this skill when:\n- The user wants to monitor or retrieve updates from a website\n- The user provides an RSS or Atom feed URL directly\n- You need to efficiently collect multiple articles from one source without visiting each page\n\n---\n\n## Step 1 — Discover the feed URL\n\nIf you already have an RSS/Atom URL, skip to Step 2.\n\n**Method A — page source**\nNavigate to the website, take a snapshot, and look for `<link rel=\"alternate\">` tags in `<head>`:\n```html\n<link rel=\"alternate\" type=\"application/rss+xml\" href=\"/feed\">\n<link rel=\"alternate\" type=\"application/atom+xml\" href=\"/atom.xml\">\n```\n\n**Method B — common paths** (try one at a time until one returns XML)\n```\n/feed  /feed.xml  /rss  /rss.xml  /atom.xml  /index.xml\n/?feed=rss2  /feeds/posts/default\n```\n\n**Method C** — look for RSS icons 🟠 or links labelled \"RSS\", \"Subscribe\", \"Feed\".\n\nA valid feed URL returns XML starting with `<rss`, `<feed`, or `<rdf:RDF`.\n\n---\n\n## Step 2 — Run the script\n\n```bash\nnode /path/to/wiseflow/skills/rss-reader/scripts/fetch-rss.mjs <feed_url> [--limit N] [--skip url1,url2,...]\n```\n\n| Option | Description |\n|--------|-------------|\n| `--limit N` | Max entries to return (default: 20) |\n| `--skip url1,url2,...` | Skip entries whose URLs are already processed (deduplication) |\n\n**Output** is markdown with two sections:\n1. **Full-content articles** — entries where the feed includes the complete article body (>200 chars). Process these directly; **no need to visit the article URL**.\n2. **Summary-only links** — entries with only a short snippet. 
Visit each URL to retrieve the full content.\n\n---\n\n## Step 3 — Handle results\n\n- For full-content articles: extract title, author, date, and content directly from the script output.\n- For summary-only links: use `browser.navigate(url)` to fetch each article page.\n- Pass the script output directly to the user or to your processing pipeline.\n\n---\n\n## Edge cases\n\n| Situation | Action |\n|-----------|--------|\n| Feed returns 404 | Try alternative paths from Step 1 |\n| Feed requires login | Follow the **browser-guide** skill |\n| Script error \"Failed to parse feed\" | Feed XML may be malformed; report the URL to the user |\n| Empty feed | Report: \"This RSS feed has no entries.\" |\n"
  },
  {
    "path": "wiseflow/skills/rss-reader/package.json",
    "content": "{\n  \"name\": \"rss-reader-skill\",\n  \"version\": \"1.0.0\",\n  \"description\": \"RSS/Atom feed reader skill for wiseflow\",\n  \"type\": \"module\",\n  \"dependencies\": {\n    \"rss-parser\": \"^3.13.0\"\n  }\n}\n"
  },
  {
    "path": "wiseflow/skills/rss-reader/scripts/fetch-rss.mjs",
    "content": "#!/usr/bin/env node\n/**\n * fetch-rss.mjs — Fetch and parse an RSS/Atom feed, output as markdown\n *\n * Usage:\n *   node fetch-rss.mjs <url> [--limit N] [--skip url1,url2,...]\n *\n * Output (stdout): markdown text ready for the LLM to read.\n *   - Entries with full content (>200 chars): included inline as article blocks.\n *   - Entries with only short snippets: listed as links for later fetching.\n */\n\nimport { createRequire } from \"node:module\";\nimport { URL } from \"node:url\";\n\nconst require = createRequire(import.meta.url);\n\nlet Parser;\ntry {\n  Parser = require(\"rss-parser\");\n} catch {\n  console.error(\"Error: 'rss-parser' not found. Run: npm install rss-parser\");\n  process.exit(1);\n}\n\n// ── Argument parsing ──────────────────────────────────────────────────────────\nconst args = process.argv.slice(2);\nif (!args.length || args[0] === \"--help\" || args[0] === \"-h\") {\n  console.log(\"Usage: node fetch-rss.mjs <url> [--limit N] [--skip url1,url2,...]\\n\");\n  process.exit(0);\n}\n\nconst feedUrl = args[0];\nlet limit = 20;\nlet skipUrls = new Set();\n\nfor (let i = 1; i < args.length; i++) {\n  if (args[i] === \"--limit\" && args[i + 1]) limit = parseInt(args[++i], 10) || 20;\n  if (args[i] === \"--skip\" && args[i + 1])\n    skipUrls = new Set(args[++i].split(\",\").map((u) => u.trim()).filter(Boolean));\n}\n\ntry {\n  new URL(feedUrl);\n} catch {\n  console.error(`Error: \"${feedUrl}\" is not a valid URL.`);\n  process.exit(1);\n}\n\n// ── Fetch ─────────────────────────────────────────────────────────────────────\nconst parser = new Parser({\n  customFields: {\n    item: [\n      [\"content:encoded\", \"contentEncoded\"],\n      [\"dc:creator\", \"dcCreator\"],\n    ],\n  },\n  timeout: 15000,\n  headers: { \"User-Agent\": \"rss-reader-skill/1.0\" },\n});\n\nlet feed;\ntry {\n  feed = await parser.parseURL(feedUrl);\n} catch (err) {\n  const msg = String(err.message || err);\n  if (msg.match(/40[134]/))\n    
console.error(`Error: Feed requires authentication or was not found — ${feedUrl}`);\n  else if (msg.match(/ENOTFOUND|ETIMEDOUT|ECONNREFUSED/))\n    console.error(`Error: Network error — ${msg}`);\n  else\n    console.error(`Error: Failed to parse feed — ${msg}`);\n  process.exit(1);\n}\n\n// ── Process entries ───────────────────────────────────────────────────────────\nconst fullArticles = [];   // entries with substantial content (>200 chars)\nconst linkOnlyItems = [];  // entries with only a short snippet or no content\n\nlet count = 0;\nfor (const item of feed.items) {\n  if (count >= limit) break;\n\n  const url = item.link || item.guid || \"\";\n  if (!url || skipUrls.has(url)) continue;\n\n  // Content priority aligned with rss_parsor.py:\n  // content:encoded > content (feedparser's content list) > summary > description\n  const rawContent =\n    item.contentEncoded || item.content || item.summary || item.description || \"\";\n  const author = item.dcCreator || item.creator || item.author || \"\";\n  const title = item.title || \"(no title)\";\n  const publishDate = item.isoDate || item.pubDate || item.published || \"\";\n  const dateStr = publishDate ? 
publishDate.slice(0, 10) : \"\";\n\n  if (rawContent.length > 200) {\n    fullArticles.push({ title, url, author, dateStr, content: rawContent });\n  } else if (rawContent.length > 50) {\n    // Short snippet — needs original page visit\n    linkOnlyItems.push({ title, url, author, dateStr, snippet: rawContent });\n  } else {\n    // No usable content — link only\n    linkOnlyItems.push({ title, url, author, dateStr, snippet: \"\" });\n  }\n  count++;\n}\n\n// ── Output ────────────────────────────────────────────────────────────────────\nconst feedTitle = feed.title || feedUrl;\nconst feedDesc = feed.description || feed.subtitle || \"\";\n\nconst lines = [];\nlines.push(`## Feed: ${feedTitle}`);\nif (feedDesc) lines.push(`> ${feedDesc}`);\nlines.push(`Source: ${feedUrl}`);\nlines.push(`Retrieved: ${new Date().toISOString().slice(0, 10)} | Total in feed: ${feed.items.length} | Returned: ${count}`);\nlines.push(\"\");\n\nif (fullArticles.length > 0) {\n  for (const a of fullArticles) {\n    lines.push(\"---\");\n    lines.push(\"\");\n    lines.push(`### ${a.title}`);\n    lines.push(`URL: ${a.url}`);\n    const meta = [a.author && `Author: ${a.author}`, a.dateStr && `Date: ${a.dateStr}`]\n      .filter(Boolean)\n      .join(\" | \");\n    if (meta) lines.push(meta);\n    lines.push(\"\");\n    lines.push(a.content);\n    lines.push(\"\");\n  }\n  lines.push(\"---\");\n  lines.push(\"\");\n}\n\nif (linkOnlyItems.length > 0) {\n  lines.push(\"## Articles with summary only — visit URL for full content:\");\n  lines.push(\"\");\n  let idx = 1;\n  for (const l of linkOnlyItems) {\n    const meta = [l.author && `Author: ${l.author}`, l.dateStr && `Date: ${l.dateStr}`]\n      .filter(Boolean)\n      .join(\", \");\n    const snippetPart = l.snippet ? ` — ${l.snippet}` : \"\";\n    lines.push(`* [[${idx}] ${l.title}](${l.url})${snippetPart}${meta ? ` (${meta})` : \"\"}`);\n    idx++;\n  }\n}\n\nconsole.log(lines.join(\"\\n\"));\n"
  },
  {
    "path": "wiseflow/skills/smart-search/SKILL.md",
    "content": "---\nname: smart-search\ndescription: Construct optimized search URLs for major platforms and navigate to results with the browser. Replaces the built-in web_search tool for targeted, platform-specific searches.\nmetadata:\n  {\n    \"openclaw\":\n      {\n        \"emoji\": \"🔍\",\n        \"always\": true,\n      },\n  }\n---\n\n# Smart Search Guide\n\nUse this skill whenever the user asks you to search for information on the web or a specific platform. **Construct the search URL directly and navigate to it** instead of using the built-in `web_search` tool.\n\n## Keyword Encoding\n\n- **Spaces**: use `+` for Bing, GitHub, Bilibili; use `%20` for Douyin, Twitter, Facebook, Zhihu; either works for Baidu, Quark, YouTube\n- **Special characters**: URL-encode them (e.g., `#` → `%23`, `&` → `%26`, `?` → `%3F`)\n- **Chinese characters**: URL-encode (browsers handle this automatically when you navigate)\n\n---\n\n## Cookie Warmup — CRITICAL for Authenticated Platforms\n\nMany platforms will return empty results or redirect to login if you navigate **directly** to a search URL without first visiting the home page. 
Always warm up the session in two steps:\n\n| Platform | Step 1 (warmup) | Step 2 (search) |\n|----------|-----------------|-----------------|\n| 知乎 | Navigate `https://www.zhihu.com` | Navigate to search URL |\n| Reddit | Navigate `https://www.reddit.com` | Navigate to search URL |\n| 微博 | Navigate `https://weibo.com` | Navigate to search URL |\n| YouTube | Navigate `https://www.youtube.com` | Navigate to search URL |\n| 雪球 | Navigate `https://xueqiu.com` | Navigate to search URL |\n| 路透社 | Navigate `https://www.reuters.com` | Navigate to search URL |\n| Bilibili | Navigate `https://www.bilibili.com` | Navigate to search URL |\n| 小红书 | Navigate `https://www.xiaohongshu.com` | Navigate to search URL |\n\n**Platforms that do NOT need warmup** (public APIs / no auth required):\n- Bing, Baidu, Quark, GitHub, arXiv, Wikipedia, BBC, HackerNews, V2EX\n\n---\n\n## General Web Search\n\n### Bing (recommended for English and international content)\n\n```\nhttps://www.bing.com/search?q={keyword}\n```\n\nTime filters (append to URL):\n- Last 24 hours: `&filters=ex1:\"ez1\"`\n- Last week: `&filters=ex1:\"ez2\"`\n- Last month: `&filters=ex1:\"ez3\"`\n\nPagination: `&first={offset}` where offset = (page − 1) × 10 + 1\n\n### Bing News\n\n```\nhttps://www.bing.com/news/search?q={keyword}\n```\n\n### Bing Images\n\n```\nhttps://www.bing.com/images/search?q={keyword}\n```\n\nTime filters for images: `&qft=filterui:age-lt{minutes}` where minutes = 1440 (day) / 10080 (week) / 44640 (month) / 525600 (year)\n\n### Baidu (recommended for Chinese content)\n\nGeneral web search:\n```\nhttps://www.baidu.com/s?wd={keyword}\n```\n\nBaidu Images:\n```\nhttps://image.baidu.com/search/index?tn=baiduimage&fm=result&ie=utf-8&word={keyword}\n```\n\n### Quark / 夸克 (recommended for Chinese news and mobile content)\n\n```\nhttps://quark.sm.cn/s?q={keyword}\n```\n\n---\n\n## Academic & Reference\n\n### arXiv (preprints: CS, Physics, Math, Biology, Economics, etc.)\n\nBrowser 
search:\n```\nhttps://arxiv.org/search/?searchtype=all&query={keyword}\n```\n\nSearch by specific field:\n- Title: `?searchtype=ti&query={keyword}`\n- Author: `?searchtype=au&query={keyword}`\n- Abstract: `?searchtype=abs&query={keyword}`\n- Category (e.g., cs.AI): `?searchtype=all&query={keyword}&searchtype=all&start=0`\n\nSort by most recent: append `&order=-announced_date_first`\n\narXiv API (returns structured XML — useful for programmatic access):\n```\nhttps://export.arxiv.org/api/query?search_query=all:{keyword}&max_results=10\n```\n\n### Wikipedia\n\nEnglish:\n```\nhttps://en.wikipedia.org/w/index.php?search={keyword}\n```\n\nChinese (中文):\n```\nhttps://zh.wikipedia.org/w/index.php?search={keyword}\n```\n\nOther languages: replace the language code prefix (e.g., `de`, `fr`, `ja`, `ko`, `es`).\n\n---\n\n## Video Platforms\n\n### YouTube\n\n```\nhttps://www.youtube.com/results?search_query={keyword}\n```\n\nTime filters (append to URL):\n- Today: `&sp=EgIIAg%3D%3D`\n- This week: `&sp=EgIIAw%3D%3D`\n- This month: `&sp=EgIIBA%3D%3D`\n- This year: `&sp=EgIIBQ%3D%3D`\n\nSort options:\n- By upload date: `&sp=CAISAhAB`\n- By view count: `&sp=CAMSAhAB`\n- By rating: `&sp=CAESAhAB`\n\nMulti-keyword: join with `+` (e.g., `wiseflow+AI+搜索`)\n\n---\n\n## Chinese Social Media\n\n### 哔哩哔哩 (Bilibili / B站)\n\n```\nhttps://search.bilibili.com/{channel}?keyword={keyword}\n```\n\nChannels: `all` (综合) | `video` (视频) | `bangumi` (番剧) | `pgc` (影视) | `live` (直播) | `article` (专栏) | `upuser` (UP主)\n\nSort options for `all` and `video`:\n- Most views: `&order=click`\n- Newest: `&order=pubdate`\n- Most danmaku: `&order=dm`\n- Most favorites: `&order=stow`\n\nSort options for `live`:\n- Search anchors only: `&search_type=live_user`\n- Search live rooms only: `&search_type=live_room`\n- Live rooms by start time: `&search_type=live_room&order=live_time`\n\nSort options for `upuser`:\n- Most fans (desc): `&order=fans`\n- Fewest fans (asc): `&order=fans&order_sort=1`\n- Highest level: 
`&order=level`\n\nSort options for `article`:\n- Newest: `&order=pubdate`\n- Most clicks: `&order=click`\n- Most popular: `&order=attention`\n- Most comments: `&order=scores`\n\nMulti-keyword: join with `+`\n\n### 抖音 (Douyin / TikTok China)\n\n```\nhttps://www.douyin.com/search/{keyword}?type={type}\n```\n\nTypes: `general` (综合，default) | `video` (视频) | `user` (用户) | `live` (直播)\n\nMulti-keyword: join with `%20` (e.g., `wiseflow%20AI`)\n\nFor sort and filter options: interact with the page UI after navigating.\n\n### 微博 (Weibo)\n\n- Comprehensive: `https://s.weibo.com/weibo/{keyword}`\n- Real-time / Latest: `https://s.weibo.com/realtime?q={keyword}`\n- Users: `https://s.weibo.com/user?q={keyword}`\n- Topics: `https://s.weibo.com/topic?q={keyword}`\n\n### 小红书 (Xiaohongshu / RED / 红薯)\n\n```\nhttps://www.xiaohongshu.com/search_result?keyword={keyword}&source=web_search_result_notes\n```\n\n> **Note**: Use `source=web_search_result_notes` (not `web_explore_feed`) to get search results instead of explore feed.\n> After navigating, wait ~3 seconds and scroll down twice — results use lazy loading.\n\nFor channel selection, filter, and sort: interact with the page UI after navigating.\n\n### 知乎 (Zhihu)\n\n```\nhttps://www.zhihu.com/search?type=content&q={keyword}\n```\n\nContent types: `content` (综合) | `people` (用户) | `scholar` (论文) | `column` (专栏) | `publication` (电子书) | `ring` (圈子) | `topic` (话题) | `zvideo` (视频)\n\nFilters for comprehensive search (`type=content`):\n- Answers only: `&vertical=answer`\n- Articles only: `&vertical=article`\n- Videos only: `&vertical=zvideo`\n\nSort:\n- Most upvotes: `&sort=upvoted_count`\n- Newest: `&sort=created_time`\n\nTime range:\n- Last day: `&time_interval=a_day`\n- Last week: `&time_interval=a_week`\n- Last month: `&time_interval=a_month`\n- Last 3 months: `&time_interval=three_months`\n- Last 6 months: `&time_interval=half_a_year`\n- Last year: `&time_interval=a_year`\n\nExample — newest articles from the last 
month:\n```\nhttps://www.zhihu.com/search?type=content&q={keyword}&vertical=article&sort=created_time&time_interval=a_month\n```\n\nMulti-keyword: join with `%20`\n\n---\n\n## International Social Media\n\n### Twitter / X\n\n- Top results: `https://x.com/search?q={keyword}`\n- Latest: `https://x.com/search?q={keyword}&f=live`\n- People: `https://x.com/search?q={keyword}&f=user`\n- Media: `https://x.com/search?q={keyword}&f=media`\n- Lists: `https://x.com/search?q={keyword}&f=list`\n\nAdd \"Near You\" filter: append `&lf=on`\n\n> **Note**: Twitter/X uses heavy client-side rendering. After navigating, wait at least **5 seconds** before taking a snapshot to ensure tweet content has loaded.\n\nMulti-keyword: join with `%20`\n\n### Facebook\n\n- All: `https://www.facebook.com/search/top/?q={keyword}`\n- People: `https://www.facebook.com/search/people/?q={keyword}`\n- Pages: `https://www.facebook.com/search/pages?q={keyword}`\n- Groups: `https://www.facebook.com/search/groups?q={keyword}`\n- Events: `https://www.facebook.com/search/events?q={keyword}`\n\nFor filter and sort options: interact with the page UI after navigating.\n\nMulti-keyword: join with `%20`\n\n### Reddit\n\n```\nhttps://www.reddit.com/search/?q={keyword}\n```\n\nSort options: `&sort=relevance` | `hot` | `top` | `new` | `comments`\n\nTime filter (for `sort=top`): `&t=hour` | `day` | `week` | `month` | `year` | `all`\n\nSearch within a specific subreddit:\n```\nhttps://www.reddit.com/r/{subreddit}/search/?q={keyword}&restrict_sr=on&sort=relevance&t=all\n```\n\nMulti-keyword: join with `+`\n\n---\n\n## Developer Platforms\n\n### GitHub\n\n```\nhttps://github.com/search?q={keyword}&type={type}\n```\n\nTypes: `repositories` | `users` | `code` | `issues` | `pullrequests` | `discussions` | `topics` | `wikis`\n\nSort options for **repositories**:\n- Most stars: `&s=stars&o=desc`\n- Fewest stars: `&s=stars&o=asc`\n- Most forks: `&s=forks&o=desc`\n- Recently updated: `&s=updated&o=desc`\n\nSort options for 
**users**:\n- Most followers: `&s=followers&o=desc`\n- Most repositories: `&s=repositories&o=desc`\n- Recently joined: `&s=joined&o=desc`\n\nLanguage filter (for `repositories` and `users`): `&l={language}` (e.g., `&l=Python`, `&l=TypeScript`, `&l=Go`)\n\nMulti-keyword: join with `+`\n\nExample:\n```\nhttps://github.com/search?q=wiseflow+addon&type=repositories&s=stars&o=desc&l=Python\n```\n\n### LinkedIn\n\nJob search (cookie warmup required — navigate `https://www.linkedin.com` first):\n```\nhttps://www.linkedin.com/jobs/search/?keywords={keyword}&location={location}\n```\n\nPeople search:\n```\nhttps://www.linkedin.com/search/results/people/?keywords={keyword}\n```\n\nCompany search:\n```\nhttps://www.linkedin.com/search/results/companies/?keywords={keyword}\n```\n\nMulti-keyword: join with `%20`\n\n---\n\n## After Navigating\n\n1. Take a snapshot to confirm results have loaded.\n2. If a CAPTCHA, login wall, or verification challenge appears, follow the **browser-guide** skill.\n3. Extract the relevant information from the visible results.\n4. If more results are needed, paginate by:\n   - Modifying the URL's pagination parameter, or\n   - Clicking the \"Next page\" button on the page.\n5. 
Close the tab immediately after extracting all needed information.\n\n---\n\n## Financial Platforms\n\n### 雪球 (Xueqiu) — Stocks & Finance\n\nStock/symbol search (cookie warmup required — navigate `https://xueqiu.com` first):\n```\nhttps://xueqiu.com/search?q={keyword}\n```\n\nExample queries: `茅台`, `AAPL`, `腾讯`, `SH600519`\n\nFor stock detail page: `https://xueqiu.com/S/{symbol}` (e.g., `/S/SH600519`)\n\n---\n\n## Tech Communities\n\n### Hacker News (public, no login required)\n\n```\nhttps://news.ycombinator.com/\n```\n\nSearch via Algolia (unofficial but reliable):\n```\nhttps://hn.algolia.com/?q={keyword}\n```\n\nFilter by date range: `&dateRange=pastWeek` | `pastMonth` | `pastYear`\n\n### V2EX (public, no login required)\n\nSearch (Google site search approach, most reliable):\n```\nhttps://www.google.com/search?q=site:v2ex.com+{keyword}\n```\n\nOr navigate directly to V2EX and use the built-in search:\n```\nhttps://www.v2ex.com/?q={keyword}\n```\n\n---\n\n## News\n\n### Reuters\n\nNews search (cookie warmup required — navigate `https://www.reuters.com` first):\n```\nhttps://www.reuters.com/search/news?blob={keyword}\n```\n\nMulti-keyword: join with `+`\n"
  }
]