[
  {
    "path": ".dockerignore",
    "content": ".env\n.venv\ndata\ndownloads\nlogs"
  },
  {
    "path": ".github/workflows/docker-image.yml",
    "content": "name: Docker Image CI\n\non:\n  release:\n    types: [ published ]\n\njobs:\n  build:\n    runs-on: ubuntu-latest\n    permissions:\n      contents: read\n      packages: write\n    steps:\n      - uses: actions/checkout@v4\n\n      - uses: docker/login-action@v3\n        with:\n          registry: ghcr.io\n          username: ${{ github.actor }}\n          password: ${{ secrets.GITHUB_TOKEN }}\n\n      - name: 构建&推送镜像\n        run: |\n          # 获取release标签版本\n          VERSION=${GITHUB_REF#refs/tags/}\n          \n          # 构建并推送带版本号的镜像\n          docker build . --file Dockerfile \\\n            --tag ghcr.io/z-mio/parse_hub_bot:${VERSION} \\\n            --tag ghcr.io/z-mio/parse_hub_bot:latest\n          \n          docker push ghcr.io/z-mio/parse_hub_bot:${VERSION}\n          docker push ghcr.io/z-mio/parse_hub_bot:latest"
  },
  {
    "path": ".gitignore",
    "content": "/.venv\n/logs\n/.idea\n/downloads\n.env\n*.session\n/data"
  },
  {
    "path": "Dockerfile",
    "content": "FROM python:3.12-slim AS build\n\nCOPY --from=ghcr.io/astral-sh/uv:0.10.11 /uv /uvx /bin/\n\nWORKDIR /app\n\nENV UV_COMPILE_BYTECODE=1 \\\n    UV_LINK_MODE=copy\n\nCOPY pyproject.toml uv.lock ./\n\nRUN apt-get update && apt-get install -y --no-install-recommends \\\n        gcc python3-dev \\\n    && rm -rf /var/lib/apt/lists/*\n\nRUN --mount=type=cache,target=/root/.cache/uv \\\n    uv sync --no-install-project --frozen\n\nCOPY . .\nRUN --mount=type=cache,target=/root/.cache/uv \\\n    uv sync --frozen\n\nFROM python:3.12-slim AS runtime\n\nRUN apt-get update && apt-get install -y --no-install-recommends \\\n        libglib2.0-0 \\\n        ffmpeg \\\n        media-types \\\n        curl unzip ca-certificates \\\n    && curl -fsSL https://deno.land/install.sh | sh \\\n    && rm -rf /var/lib/apt/lists/*\n\nENV DENO_INSTALL=\"/root/.deno\"\nENV PATH=\"/app/.venv/bin:$DENO_INSTALL/bin:$PATH\"\n\nWORKDIR /app\nCOPY --from=build /app /app\n\nENV PATH=\"/app/.venv/bin:$PATH\"\n\n\nCMD [\"python\", \"bot.py\"]\n"
  },
  {
    "path": "LICENSE",
    "content": "MIT License\n\nCopyright (c) 2024 梓澪\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the \"Software\"), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all\ncopies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\nSOFTWARE.\n"
  },
  {
    "path": "README.md",
    "content": "<div align=\"center\">\n\n# 🔗 ParseHubBot\n\n**Telegram 多平台聚合解析机器人**\n\n<p align=\"center\">\n  <a href=\"https://github.com/z-mio/Parse_Hub_Bot/blob/main/LICENSE\">\n    <img src=\"https://img.shields.io/github/license/z-mio/Parse_Hub_Bot?style=flat-square&color=5D6D7E\" alt=\"License\">\n  </a>\n  <a href=\"https://www.python.org/\">\n    <img src=\"https://img.shields.io/badge/Python-3.12+-blue?style=flat-square&logo=python&logoColor=white\" alt=\"Python\">\n  </a>\n  <a href=\"https://t.me/ParseHubot\">\n    <img src=\"https://img.shields.io/badge/Telegram-Bot-2CA5E0?style=flat-square&logo=telegram&logoColor=white\" alt=\"Telegram Bot\">\n  </a>\n  <a href=\"https://github.com/astral-sh/uv\">\n    <img src=\"https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/astral-sh/uv/main/assets/badge/v0.json&style=flat-square\" alt=\"uv\">\n  </a>\n</p>\n\n[**🤖 实例演示**](https://t.me/ParseHubot) ·\n[**📚 相关项目**](https://github.com/z-mio/ParseHub) ·\n[**🐛 问题反馈**](https://github.com/z-mio/Parse_Hub_Bot/issues)\n\n</div>\n\n---\n\n> 官方实例：[@ParseHubot](https://t.me/ParseHubot)\n\n## ✨ 功能特性\n\n- 🎬 **多平台解析** — 抖音、B站、YouTube、小红书、Twitter 等 16+ 主流平台一站搞定\n- ⚡ **内联模式** — 在任意聊天窗口输入 `@BotUsername <链接>` 即可解析\n- 🖼️ **Tg 兼容** — 自动转码、长图切割、大视频分段\n- 📦 **多种模式** — 在线预览, 原始文件, 打包下载\n- 🐳 **Docker 部署** — 开箱即用\n\n## 📦 支持平台一览\n\n| 平台              | 视频 | 图文 |  其他   |\n|:----------------|:--:|:--:|:-----:|\n| **Twitter / X** | ✅  | ✅  | 📝 文章 |\n| **Instagram**   | ✅  | ✅  |       |\n| **YouTube**     | ✅  |    | 🎵 音乐 |\n| **Facebook**    | ✅  |    |       |\n| **Threads**     | ✅  | ✅  |       |\n| **Bilibili**    | ✅  |    | 📝 动态 |\n| **抖音**          | ✅  | ✅  |       |\n| **TikTok**      | ✅  | ✅  |       |\n| **微博**          | ✅  | ✅  |       |\n| **小红书**         | ✅  | ✅  |       |\n| **贴吧**          | ✅  | ✅  |       |\n| **微信公众号**       |    | ✅  |       |\n| **快手**          | ✅  |    |       |\n| **酷安**          | ✅  | ✅  |       |\n| **皮皮虾**         | ✅  | ✅  | 
      |\n| **最右**          | ✅  | ✅  |       |\n| **小黑盒**         | ✅  | ✅  |       |\n\n> 🔧 更多平台持续接入中...\n\n## 🚀 快速开始\n\n### 🐳 Docker 运行 (推荐)\n\n```bash\nmkdir parse_hub_bot && cd parse_hub_bot\n\ndocker run -d \\\n  --restart=always \\\n  -e API_ID=你的API_ID \\\n  -e API_HASH=你的API_HASH \\\n  -e BOT_TOKEN=你的BOT_TOKEN \\\n  -v ./logs:/app/logs \\\n  -v ./data:/app/data \\\n  --name parse-hub-bot \\\n  ghcr.io/z-mio/parse_hub_bot:latest\n```\n\n### 💻 源码运行\n\n```bash\nuv sync\nuv run bot.py\n```\n\n---\n\n## ⚙️ 配置说明\n\n- **环境变量:** 基础配置\n- **平台配置 (可选):** 平台代理和 Cookie\n\n### 📝 环境变量\n\n```dotenv\n# ✅ 必填\nAPI_ID=        # Telegram API ID，登录 https://my.telegram.org 获取\nAPI_HASH=      # Telegram API Hash，同上获取\nBOT_TOKEN=     # 机器人 Token，向 @BotFather 申请\n\n# 🔲 可选\nBOT_PROXY=     # Bot 连接 TG 使用的代理，例：http://127.0.0.1:7890\n```\n\n### 🌐 平台配置\n\n用于为各解析平台单独配置**代理**和 **Cookie**，位于 `data/config/platform_config.yaml`\n\n```yaml\n# ═══════════════════════ 全局默认代理 ═══════════════════════\n# 当某平台未单独配置代理时，会使用全局默认代理\n# 支持填写单个地址(字符串)或多个地址(列表，随机选取)\n\ndefault_parser_proxies: http://127.0.0.1:7890        # 解析代理（单个）\ndefault_downloader_proxies: # 下载代理（代理池）\n  - http://127.0.0.1:7890\n  - http://127.0.0.1:7891\n\n# ═══════════════════════ 平台独立配置 ═══════════════════════\nplatforms:\n  <platform_id>: # 平台 ID，见下方支持列表\n    disable_parser_proxy: false          # 是否禁用解析代理（直连）\n    disable_downloader_proxy: false      # 是否禁用下载代理（直连）\n    parser_proxies: # 该平台专用解析代理池\n      - http://proxy1:port\n    downloader_proxies: # 该平台专用下载代理池\n      - http://proxy2:port\n    cookies: # 该平台 Cookie 列表（随机选取）\n      - \"cookie_string_1\"\n      - \"cookie_string_2\"\n```\n\n### 🔀 代理优先级\n\n解析代理和下载代理各自遵循相同的优先级逻辑：\n\n```\n禁用代理 (disable_*_proxy: true)\n  ↓ 未禁用\n平台专用代理 (parser_proxies / downloader_proxies)\n  ↓ 未配置\n全局默认代理 (default_parser_proxies / default_downloader_proxies)\n  ↓ 未配置\n直连（不使用代理）\n```\n\n> 💡 当代理池中有多个地址时，每次请求会**随机选取**一个\n\n### 🔑 支持的平台 ID\n\n`<platform_id>` 必须是以下合法的平台 ID：\n\n| 平台 ID       | 对应平台        
|\n|:------------|:------------|\n| `twitter`   | Twitter / X |\n| `instagram` | Instagram   |\n| `youtube`   | YouTube     |\n| `facebook`  | Facebook    |\n| `threads`   | Threads     |\n| `bilibili`  | 哔哩哔哩        |\n| `douyin`    | 抖音          |\n| `tiktok`    | TikTok      |\n| `weibo`     | 微博          |\n| `xhs`       | 小红书         |\n| `tieba`     | 百度贴吧        |\n| `wechat`    | 微信公众号       |\n| `kuaishou`  | 快手          |\n| `coolapk`   | 酷安          |\n| `pipixia`   | 皮皮虾         |\n| `zuiyou`    | 最右          |\n| `xiaoheihe` | 小黑盒         |\n\n### 🍪 支持 Cookie 的平台\n\n- `Twitter / X`\n- `Instagram`\n- `YouTube`\n- `Bilibili`\n- `抖音`\n- `TikTok`\n- `快手`\n- `小红书`\n\n### 📌 配置示例\n\n#### 示例 1：国内平台直连，海外平台走代理\n\n```yaml\ndefault_parser_proxies: http://127.0.0.1:7890\ndefault_downloader_proxies: http://127.0.0.1:7890\n\nplatforms:\n  bilibili:\n    disable_parser_proxy: true\n    disable_downloader_proxy: true\n  douyin:\n    disable_parser_proxy: true\n    disable_downloader_proxy: true\n  xhs:\n    disable_parser_proxy: true\n    disable_downloader_proxy: true\n```\n\n#### 示例 2：Twitter 配置 Cookie + 使用全局代理\n\n```yaml\ndefault_parser_proxies: http://127.0.0.1:7890\ndefault_downloader_proxies: http://127.0.0.1:7890\n\nplatforms:\n  twitter:\n    cookies:\n      - \"auth_token=your_token_here; ct0=your_ct0_here\"\n```\n\n#### 示例 3：YouTube 使用独立代理池\n\n```yaml\nplatforms:\n  youtube:\n    parser_proxies:\n      - http://proxy-us-1:8080\n      - http://proxy-us-2:8080\n      - http://proxy-eu-1:8080\n    downloader_proxies:\n      - http://proxy-us-1:8080\n      - http://proxy-eu-1:8080\n```\n\n#### 示例 4：B站指定 Cookie 轮换 + 解析直连 + 下载走代理\n\n```yaml\nplatforms:\n  bilibili:\n    disable_parser_proxy: true\n    downloader_proxies:\n      - http://127.0.0.1:7890\n    cookies:\n      - \"SESSDATA=xxx; bili_jct=xxx; buvid3=xxx\"\n      - \"SESSDATA=yyy; bili_jct=yyy; buvid3=yyy\"\n```\n\n## 🌟 Star History\n\n[![Star History Chart](https://api.star-history.com/svg?repos=z-mio/Parse_Hub_Bot&type=Date)](https://star-history.com/#z-mio/Parse_Hub_Bot&Date)\n\n## 🤝 参与贡献\n\n欢迎提交 Pull Request 或 Issue！\n\n- 核心解析相关请前往 [ParseHub](https://github.com/z-mio/ParseHub)。\n- Bug 反馈请附上相关 URL 和日志信息。\n\n## 📄 开源协议\n\n本项目基于 [MIT License](LICENSE) 协议开源。\n\n---\n\n<div align=\"center\">\n\n**如果这个项目对你有帮助，欢迎点个 ⭐ Star！**\n\n</div>\n\n"
  },
  {
    "path": "bot.py",
    "content": "import asyncio\nimport shutil\nfrom typing import Any\n\nimport pillow_heif\nfrom pyrogram import Client\nfrom pyrogram.handlers import ConnectHandler, DisconnectHandler\nfrom pyrogram.types import BotCommand\n\nfrom core import bs, on_connect, on_disconnect, ws\nfrom log import logger, setup_logging\nfrom services import parse_cache, persistent_cache\nfrom utils.event_loop import setup_optimized_event_loop\n\npillow_heif.register_heif_opener()\n\nsetup_logging(debug=bs.debug)\n\nloop = asyncio.new_event_loop()\nasyncio.set_event_loop(loop)\n\nsetup_optimized_event_loop()\n\n\nclass Bot(Client):\n    def __init__(self) -> None:\n        self.cfg = bs\n\n        super().__init__(\n            f\"{self.cfg.bot_token.split(':')[0]}_bot\",\n            api_id=self.cfg.api_id,\n            api_hash=self.cfg.api_hash,\n            bot_token=self.cfg.bot_token,\n            plugins={\"root\": \"plugins\"},\n            proxy=self.cfg.bot_proxy,\n            loop=loop,\n            workdir=self.cfg.sessions_path,\n        )\n\n    async def start(self, *args: Any, **kwargs: Any) -> \"Bot\":\n        self.init_watchdog()\n        parse_cache.start_cleanup()\n        persistent_cache.start_cleanup()\n        await super().start()\n        await self.set_menu()\n        return self\n\n    async def stop(self, *args: Any, **kwargs: Any) -> None:\n        ws.exit_flag = True\n        await persistent_cache.close()\n        await super().stop()\n        # 结束时清理下载残留\n        if self.cfg.download_dir.exists():\n            shutil.rmtree(self.cfg.download_dir)\n\n    def init_watchdog(self) -> None:\n        self.add_handler(ConnectHandler(on_connect))\n        self.add_handler(DisconnectHandler(on_disconnect))\n\n    async def set_menu(self) -> None:\n        commands = {\n            \"start\": \"开始\",\n            \"jx\": \"解析\",\n            \"raw\": \"不处理媒体, 发送原始文件\",\n            \"zip\": \"不处理媒体, 保存解析结果, 发送压缩包\",\n        }\n        await 
self.set_bot_commands([BotCommand(command=k, description=v) for k, v in commands.items()])\n        logger.debug(f\"菜单已设置: {commands}\")\n\n\nif __name__ == \"__main__\":\n    bot = Bot()\n    bot.run()\n"
  },
  {
    "path": "core/__init__.py",
    "content": "from .config import bs, ws\nfrom .platform_config import pl_cfg\nfrom .watchdog import on_connect, on_disconnect\n\n__all__ = [\n    \"bs\",\n    \"ws\",\n    \"pl_cfg\",\n    \"on_connect\",\n    \"on_disconnect\",\n]\n"
  },
  {
    "path": "core/config.py",
    "content": "import os\nfrom pathlib import Path\nfrom typing import Any\nfrom urllib.parse import urlparse\n\nfrom dotenv import load_dotenv\nfrom pydantic import Field, field_validator, model_validator\nfrom pydantic_settings import BaseSettings, SettingsConfigDict\n\nload_dotenv()\n\n\nclass BotSettings(BaseSettings):\n    model_config = SettingsConfigDict(\n        env_file=\".env\",\n        env_file_encoding=\"utf-8\",\n        extra=\"ignore\",\n    )\n\n    bot_token: str = Field(...)\n    api_id: str = Field(...)\n    api_hash: str = Field(...)\n    bot_proxy: dict | None = Field(default=None)\n    data_path: Path = Path(\"data\")\n    cache_time: int = Field(default=14 * 24 * 60, ge=0, description=\"缓存时间, 单位分钟, 0 为禁用\")\n    cache_max_entries: int = Field(default=30000, ge=0, description=\"缓存最大条数, 0 为不限制\")\n    cache_save_interval: int = Field(default=5, gt=0, description=\"缓存保存间隔, 单位分钟\")\n    cache_cleanup_interval: int = Field(default=60, gt=0, description=\"缓存过期清理间隔, 单位分钟\")\n    download_dir: Path = Path(\"downloads\")\n    debug: bool = Field(default=False)\n    debug_skip_cleanup: bool = Field(default=False, description=\"跳过资源清理\")\n\n    @model_validator(mode=\"after\")\n    def cache_config_validate(self) -> \"BotSettings\":\n        if self.cache_time and self.cache_cleanup_interval > self.cache_time:\n            raise ValueError(\"CACHE_CLEANUP_INTERVAL 不能大于 CACHE_TIME\")\n        return self\n\n    def model_post_init(self, __context: Any) -> None:\n        \"\"\"模型初始化后的操作\"\"\"\n        self.sessions_path.mkdir(parents=True, exist_ok=True)\n        self.cache_path.mkdir(parents=True, exist_ok=True)\n        self.config_path.mkdir(parents=True, exist_ok=True)\n\n    @property\n    def sessions_path(self) -> Path:\n        return self.data_path / \"sessions\"\n\n    @property\n    def cache_path(self) -> Path:\n        return self.data_path / \"cache\"\n\n    @property\n    def config_path(self) -> Path:\n        return self.data_path / 
\"config\"\n\n    @field_validator(\"bot_proxy\", mode=\"before\")\n    @classmethod\n    def proxy_config(cls, v: str | None = None) -> dict | None:\n        url = urlparse(v) if v else None\n        if not url:\n            return None\n        return {\n            \"scheme\": url.scheme,\n            \"hostname\": url.hostname,\n            \"port\": url.port,\n            \"username\": url.username,\n            \"password\": url.password,\n        }\n\n    @property\n    def bot_session_name(self) -> str:\n        return f\"bot_{self.bot_token.split(':')[0]}\"\n\n    @field_validator(\"data_path\", mode=\"before\")\n    @classmethod\n    def data_path_init(cls, v: str | Path) -> Path:\n        p = Path(v) if isinstance(v, str) else v\n        p.mkdir(exist_ok=True, parents=True)\n        return p\n\n\nclass WatchdogSettings(BaseSettings):\n    model_config = SettingsConfigDict(\n        env_file=None,\n        extra=\"ignore\",\n        env_prefix=\"WD_\",\n    )\n    is_running: bool = Field(default=False)\n    \"\"\"运行中\"\"\"\n    restart_count: int = Field(default=0)\n    \"\"\"重启次数\"\"\"\n    disconnect_count: int = Field(default=0)\n    \"\"\"断开连接次数\"\"\"\n    max_disconnect_count: int = Field(default=3)\n    \"\"\"最大断开连接次数, 超过后重启\"\"\"\n    remove_session_after_restart: int = Field(default=3)\n    \"\"\"重启失败几次后删除会话文件\"\"\"\n    max_restart_count: int = Field(default=6)\n    \"\"\"意外断开连接时，最大重启次数\"\"\"\n    exit_flag: bool = Field(default=False)\n    \"\"\"退出标志\"\"\"\n\n    def update_bot_restart_count(self) -> None:\n        self.restart_count += 1\n        os.environ[\"WD_RESTART_COUNT\"] = str(self.restart_count)\n\n    def reset_bot_restart_count(self) -> None:\n        self.restart_count = 0\n        os.environ[\"WD_RESTART_COUNT\"] = \"0\"\n\n    def update_bot_disconnect_count(self) -> None:\n        self.disconnect_count += 1\n        os.environ[\"WD_DISCONNECT_COUNT\"] = str(self.disconnect_count)\n\n    def reset_bot_disconnect_count(self) -> 
None:\n        self.disconnect_count = 0\n        os.environ[\"WD_DISCONNECT_COUNT\"] = \"0\"\n\n\nbs = BotSettings()  # type: ignore[call-arg]\nws = WatchdogSettings()\n"
  },
  {
    "path": "core/platform_config.py",
    "content": "import random\nfrom pathlib import Path\n\nfrom parsehub.types import Platform as PPlatform\nfrom pydantic import BaseModel, ConfigDict, HttpUrl\nfrom yaml import safe_load\n\nfrom log import logger\n\nfrom .config import bs\n\nlogger = logger.bind(name=\"PlatformConfig\")\n\n\nclass Platform(BaseModel):\n    model_config = ConfigDict(extra=\"forbid\")\n\n    disable_parser_proxy: bool = False\n    disable_downloader_proxy: bool = False\n    parser_proxies: list[HttpUrl] | None = None\n    downloader_proxies: list[HttpUrl] | None = None\n    cookies: list[str] | None = None\n\n    def roll_cookie(self) -> str | None:\n        if not self.cookies:\n            return None\n        return random.choice(self.cookies)\n\n    def roll_parser_proxy(self) -> str | None:\n        if not self.parser_proxies:\n            return None\n        return str(random.choice(self.parser_proxies))\n\n    def roll_downloader_proxy(self) -> str | None:\n        if not self.downloader_proxies:\n            return None\n        return str(random.choice(self.downloader_proxies))\n\n\nclass PlatformsConfig(BaseModel):\n    model_config = ConfigDict(extra=\"forbid\")\n\n    default_parser_proxies: list[HttpUrl] | None = None\n    default_downloader_proxies: list[HttpUrl] | None = None\n    platforms: dict[str, Platform] = {}\n\n    @classmethod\n    def load_config(cls, file: Path) -> \"PlatformsConfig\":\n        if not file.exists():\n            logger.info(\"未找到 platform_config.yaml, 跳过加载\")\n            return cls()\n\n        with open(file, encoding=\"utf-8\") as f:\n            data = safe_load(f)\n\n        if not data:\n            logger.info(\"platform_config.yaml 为空, 跳过加载\")\n            return cls()\n\n        platforms = {}\n        if data.get(\"platforms\"):\n            pid_list = [p.id for p in PPlatform]\n            for name, pdata in data[\"platforms\"].items():\n                if name not in pid_list:\n                    logger.error(f\"平台 [{name}] 
不存在, 支持的平台id: {pid_list}\")\n                    exit(1)\n\n                if not pdata:\n                    continue\n\n                try:\n                    platforms[name] = Platform(**pdata)\n                except Exception as e:\n                    logger.error(f\"平台 [{name}] 配置错误:\\n{e}\")\n                    raise SystemExit(1) from e\n\n        pc = cls(\n            default_parser_proxies=cls._2l(data.get(\"default_parser_proxies\", None)),\n            default_downloader_proxies=cls._2l(data.get(\"default_downloader_proxies\", None)),\n            platforms=platforms,\n        )\n        logger.debug(f\"已载入平台配置: {pc.model_dump_json(indent=4)}\")\n        return pc\n\n    @staticmethod\n    def _2l[T](v: T | list[T] | None) -> list[T] | None:\n        if v is None:\n            return None\n        if isinstance(v, list):\n            return v\n        return [v]\n\n    def get(self, platform_id: str) -> Platform | None:\n        return self.platforms.get(platform_id)\n\n    def roll_cookie(self, platform_id: str) -> str | None:\n        if not (pc := self.get(platform_id)):\n            return None\n        return pc.roll_cookie()\n\n    def roll_parser_proxy(self, platform_id: str) -> str | None:\n        if not (pc := self.get(platform_id)):\n            pc = Platform()\n        if pc.disable_parser_proxy:\n            return None\n\n        if platform_proxy := pc.roll_parser_proxy():\n            return platform_proxy\n        if self.default_parser_proxies:\n            return str(random.choice(self.default_parser_proxies))\n        return None\n\n    def roll_downloader_proxy(self, platform_id: str) -> str | None:\n        if not (pc := self.get(platform_id)):\n            pc = Platform()\n        if pc.disable_downloader_proxy:\n            return None\n\n        if platform_proxy := pc.roll_downloader_proxy():\n            return platform_proxy\n        if self.default_downloader_proxies:\n            return 
str(random.choice(self.default_downloader_proxies))\n        return None\n\n\npl_cfg = PlatformsConfig.load_config(bs.config_path / \"platform_config.yaml\")\n"
  },
  {
    "path": "core/watchdog.py",
    "content": "import asyncio\nimport os\nimport sys\n\nfrom pyrogram import Client\nfrom pyrogram.session import Session\n\nfrom core.config import bs, ws\nfrom log import logger\n\nlogger = logger.bind(name=\"Watchdog\")\n\n\nasync def reset_count_task() -> None:\n    \"\"\"重置重启次数任务\"\"\"\n    if ws.restart_count:\n        logger.info(f\"第 {ws.restart_count} 次重启成功, 稳定运行 10 分钟后重置重启次数\")\n    elif ws.disconnect_count:\n        logger.info(\"Bot 重连成功, 稳定运行 10 分钟后重置断开连接次数\")\n\n    await asyncio.sleep(600)\n    ws.reset_bot_disconnect_count()\n    ws.reset_bot_restart_count()\n    logger.info(\"已稳定运行 10 分钟, 次数已重置\")\n\n\nasync def on_connect(_: Client, session: Session) -> None:\n    \"\"\"Bot 连接成功回调函数\"\"\"\n\n    if session.is_media:\n        return\n\n    ws.is_running = True\n    logger.success(\"Bot 开始运行...\")\n\n    if ws.restart_count or ws.disconnect_count:\n        asyncio.create_task(reset_count_task())\n\n\nasync def on_disconnect(cli: Client, session: Session) -> None:\n    \"\"\"Bot 断开连接回调函数\"\"\"\n\n    if session.is_media:\n        return\n\n    if ws.exit_flag:\n        ws.is_running = False\n\n    # 正常退出\n    if ws.exit_flag and not ws.is_running:\n        logger.info(\"Bot 已结束运行\")\n        return\n\n    # 启动失败\n    if not ws.is_running and not ws.restart_count:\n        exit(\"Bot 连接失败, 请检查设备网络和代理配置\")\n\n    # 断开连接\n    if ws.restart_count >= ws.max_restart_count:\n        exit(f\"重启次数已达上限 ({ws.max_restart_count} 次), 结束进程\")\n\n    if ws.disconnect_count < ws.max_disconnect_count:\n        ws.update_bot_disconnect_count()\n        logger.warning(f\"Bot 已断开连接... | {ws.disconnect_count}/{ws.max_disconnect_count}\")\n        return\n\n    if bs.debug:\n        exit(\"Bot 已断开连接, 目前处于调试模式, 已跳过重启\")\n\n    try:\n        ws.update_bot_restart_count()\n        logger.warning(f\"Bot 已断开连接, 尝试重启... 
| {ws.restart_count}/{ws.max_restart_count}\")\n\n        if ws.restart_count == ws.remove_session_after_restart and not cli.in_memory:\n            await remove_session_file(cli)\n\n        python = sys.executable\n        os.execv(python, [python] + sys.argv)\n    except Exception as e:\n        logger.exception(e)\n        exit(\"重启失败, 结束进程, 以上为错误信息\")\n\n\nasync def remove_session_file(cli: Client) -> None:\n    \"\"\"删除会话文件\"\"\"\n    logger.warning(\"尝试删除会话文件...\")\n    try:\n        if cli.session is not None:\n            await cli.session.stop()\n        await cli.storage.close()\n        if (session := cli.workdir / f\"{cli.name}.session\") and session.exists():\n            os.remove(session)\n            logger.warning(f\"会话文件已移除: {session}\")\n    except Exception as e:\n        logger.error(f\"移除会话文件失败: {e}\")\n"
  },
  {
    "path": "log.py",
    "content": "import inspect\nimport logging\nimport sys\nfrom typing import TYPE_CHECKING, Any\n\nimport loguru\n\nif TYPE_CHECKING:\n    from loguru import Logger\n\nlogger: \"Logger\" = loguru.logger.bind(name=\"Main\")\n\n\ndef formatter(record: Any) -> str:\n    rid = record[\"extra\"].get(\"req_id\")\n    if rid:\n        return (\n            \"<green>{time:HH:mm:ss}</green> | \"\n            \"<level>{level: <8}</level> | \"\n            \"<cyan>{name}:{function}:{line}</cyan> | \"\n            \"<level>[{extra[name]}][{extra[req_id]}] {message}</level>\\n\"\n        )\n    else:\n        return (\n            \"<green>{time:HH:mm:ss}</green> | \"\n            \"<level>{level: <8}</level> | \"\n            \"<cyan>{name}:{function}:{line}</cyan> | \"\n            \"<level>[{extra[name]}] {message}</level>\\n\"\n        )\n\n\ndef setup_logging(debug: bool = False) -> None:\n    logger.remove()\n\n    level = \"DEBUG\" if debug else \"INFO\"\n    logger.add(sys.stderr, level=level, format=formatter)\n\n    logger.add(\n        \"logs/bot.log\",\n        rotation=\"10 MB\",\n        level=\"INFO\",\n        format=formatter,\n        enqueue=True,\n    )\n\n    if debug:\n        logger.debug(\"调试模式已启用\")\n\n\nclass InterceptHandler(logging.Handler):\n    def emit(self, record: logging.LogRecord) -> None:\n        try:\n            level: str | int = logger.level(record.levelname).name\n        except ValueError:\n            level = record.levelno\n\n        frame, depth = inspect.currentframe(), 0\n        while frame:\n            filename = frame.f_code.co_filename\n            is_logging = filename == logging.__file__\n            is_frozen = \"importlib\" in filename and \"_bootstrap\" in filename\n            if depth > 0 and not (is_logging or is_frozen):\n                break\n            frame = frame.f_back\n            depth += 1\n\n        logger.opt(depth=depth, exception=record.exc_info).log(level, 
record.getMessage())\n\n\nlogging.basicConfig(handlers=[InterceptHandler()], level=\"ERROR\", force=True)\n"
  },
  {
    "path": "plugins/__init__.py",
    "content": ""
  },
  {
    "path": "plugins/filters.py",
    "content": "from typing import Any\n\nfrom pyrogram import filters\nfrom pyrogram.types import InlineQuery, Message\n\nfrom services import ParseService\n\n\nasync def _platform_filter(_: Any, __: Any, update: Message | InlineQuery) -> bool:\n    t: str | None = None\n    match update:\n        case Message():\n            t = update.caption or update.text\n        case InlineQuery():\n            t = update.query\n    try:\n        return bool(t and ParseService().parser.get_platform(t))\n    except Exception:\n        return False\n\n\nplatform_filter = filters.create(_platform_filter)\n"
  },
  {
    "path": "plugins/helpers.py",
    "content": "\"\"\"plugins 共用的工具函数和数据类\"\"\"\n\nfrom dataclasses import dataclass\nfrom pathlib import Path\n\nfrom markdown import markdown\nfrom parsehub import ParseHub, Platform\nfrom parsehub.types import AnyMediaFile, AnyParseResult, DownloadResult, RichTextParseResult\nfrom parsehub.utils.media_info import MediaInfoReader\nfrom pyrogram import Client\n\nfrom log import logger\nfrom utils.converter import clean_article_html\nfrom utils.helpers import to_list\nfrom utils.media_processing_unit import MediaProcessingUnit\nfrom utils.ph import Telegraph\n\nlogger = logger.bind(name=\"Helpers\")\n\n\n@dataclass\nclass ProcessedMedia:\n    source: AnyMediaFile\n    output_paths: list[Path] | None = None\n    output_dir: Path | None = None\n\n\ndef resolve_media_info(processed: \"ProcessedMedia\", file_path: str) -> tuple[int, int, int]:\n    \"\"\"获取媒体的宽、高、时长。若经过转码则从文件读取，否则使用源信息。\"\"\"\n    if processed.output_paths:\n        info = MediaInfoReader.read(file_path)\n        return info.width, info.height, info.duration\n    return processed.source.width, processed.source.height, getattr(processed.source, \"duration\", 0)\n\n\ndef build_caption(parse_result: AnyParseResult, telegraph_url: str | None = None) -> str:\n    return build_caption_by_str(parse_result.title, parse_result.content, parse_result.raw_url, telegraph_url)\n\n\ndef build_caption_by_str(title: str | None, content: str | None, raw_url: str, telegraph_url: str | None = None) -> str:\n    \"\"\"构建消息正文：标题 + 内容 + 来源链接\"\"\"\n    title, content = title or \"\", content or \"\"\n\n    if telegraph_url:\n        label = (title or content[:15]).replace(\"\\n\", \" \") or \"无标题\"\n        body = f\"**[{label}]({telegraph_url})**\"\n    else:\n        parts = []\n        if title:\n            parts.append(f\"**{title}**\")\n        if content:\n            parts.append(content)\n        body = format_text(\"\\n\\n\".join(parts) or \"**无标题**\")\n\n    return f\"{body}\\n\\n<b>▎<a 
href='{raw_url}'>Source</a></b>\"\n\n\ndef format_text(text: str) -> str:\n    \"\"\"格式化输出内容, 限制长度, 添加折叠块样式\"\"\"\n    text = text.strip()\n    if len(text) > 500 or len(text.splitlines()) > 10:\n        if len(text) > 1000:\n            text = text[:900] + \"......\"\n        return f\"<blockquote expandable>{text}</blockquote>\"\n    else:\n        return text\n\n\ndef progress(current: int, total: int, unit: str) -> str | None:\n    if unit == \"bytes\":\n        if total <= 0:\n            return None\n\n        text = f\"下 载 中... | {current * 100 / total:.0f}%\"\n        if round(current * 100 / total, 1) % 25 == 0:\n            return text\n    else:\n        text = f\"下 载 中... | {current}/{total}\"\n        if (current + 1) % 3 == 0 or (current + 1) == total:\n            return text\n    return None\n\n\nasync def create_telegraph_page(html_content: str, cli: Client, parse_result: AnyParseResult) -> str:\n    \"\"\"创建 Telegraph 页面，返回页面 URL\"\"\"\n    logger.debug(f\"创建 Telegraph 页面: title={parse_result.title}\")\n    me = await cli.get_me()\n    page = await Telegraph().create_page(\n        parse_result.title or \"无标题\",\n        html_content=html_content,\n        author_name=me.full_name,\n        author_url=parse_result.raw_url,\n    )\n    logger.debug(f\"Telegraph 页面已创建: {page.url}\")\n    return page.url\n\n\nasync def create_richtext_telegraph(cli: Client, parse_result: RichTextParseResult) -> str:\n    \"\"\"将富文本解析结果转换为 Telegraph 页面，返回页面 URL\"\"\"\n    logger.debug(f\"富文本转 Telegraph: platform={parse_result.platform}, md_len={len(parse_result.markdown_content)}\")\n    md = parse_result.markdown_content\n    match parse_result.platform:\n        case Platform.WEIXIN:\n            md = md.replace(\"mmbiz.qpic.cn\", \"qpic.cn.in/mmbiz.qpic.cn\")\n        case Platform.COOLAPK:\n            md = md.replace(\"image.coolapk.com\", \"qpic.cn.in/image.coolapk.com\")\n    html = clean_article_html(markdown(md))\n    return await create_telegraph_page(html, 
cli, parse_result)\n\n\nasync def process_media_files(download_result: DownloadResult) -> list[ProcessedMedia]:\n    \"\"\"对下载结果中的媒体文件进行格式转换，返回 ProcessedMedia 列表\"\"\"\n    processed_dir = download_result.output_dir.joinpath(\"processed\")\n    processor = MediaProcessingUnit(processed_dir, segment_height=1920, logger=logger.bind(name=\"MediaProcessor\").debug)\n    media_files = to_list(download_result.media)\n    logger.debug(f\"开始媒体格式转换: 文件数={len(media_files)}, output_dir={processed_dir}\")\n    processed_list: list[ProcessedMedia] = []\n    for media_file in media_files:\n        # 对于实况图片只处理图片, 不处理视频\n        logger.debug(f\"处理文件: {media_file.path}\")\n        result = await processor.process(media_file.path)\n        logger.debug(f\"处理结果: output_paths={result.output_paths}\")\n        processed_list.append(ProcessedMedia(media_file, result.output_paths, result.temp_dir))\n    logger.debug(f\"媒体格式转换完成: 处理数={len(processed_list)}\")\n    return processed_list\n\n\ndef get_supported_platforms() -> str:\n    text: list[str] = []\n    for i in ParseHub().get_platforms():\n        text.append(f\"**{i['name']}** __({'__, __'.join(i['supported_types'])})__\")\n    text.sort(reverse=True)\n    return \"\\n\".join(text)\n\n\ndef build_start_text() -> str:\n    return (\n        f\"**发送分享链接以进行解析**\\n\\n\"\n        f\"**支持的平台:**\\n\"\n        f\"<blockquote expandable>{get_supported_platforms()}</blockquote>\\n\\n\"\n        f\"**命令列表:**\\n\"\n        f\"`/jx <链接>` - 解析并发送媒体\\n\"\n        f\"`/raw <链接>` - 不处理媒体, 发送原始文件\\n\"\n        f\"`/zip <链接>` - 不处理媒体, 保存解析结果, 发送压缩包\\n\\n\"\n        f\"**开源地址: [GitHub](https://github.com/z-mio/parse_hub_bot)**\"\n    )\n"
  },
  {
    "path": "plugins/inline_parse.py",
    "content": "import asyncio\n\nfrom parsehub import AnyParseResult\nfrom parsehub.types import (\n    AniRef,\n    ImageRef,\n    PostType,\n    VideoRef,\n)\nfrom pyrogram import Client\nfrom pyrogram.errors import FloodWait\nfrom pyrogram.types import (\n    ChosenInlineResult,\n    InlineQuery,\n    InlineQueryResult,\n    InlineQueryResultAnimation,\n    InlineQueryResultArticle,\n    InlineQueryResultCachedAnimation,\n    InlineQueryResultCachedDocument,\n    InlineQueryResultCachedPhoto,\n    InlineQueryResultCachedVideo,\n    InlineQueryResultPhoto,\n    InlineQueryResultVideo,\n    InputMediaVideo,\n    InputTextMessageContent,\n    LinkPreviewOptions,\n)\nfrom pyrogram.types import (\n    InlineKeyboardButton as Ikb,\n)\nfrom pyrogram.types import (\n    InlineKeyboardMarkup as Ikm,\n)\n\nfrom log import logger\nfrom plugins.filters import platform_filter\nfrom plugins.helpers import (\n    build_caption,\n    build_caption_by_str,\n    build_start_text,\n    create_richtext_telegraph,\n    resolve_media_info,\n)\nfrom services import ParseService\nfrom services.cache import CacheEntry, CacheMediaType, parse_cache, persistent_cache\nfrom services.pipeline import ParsePipeline, StatusReporter\nfrom utils.helpers import to_list, with_request_id\n\nlogger = logger.bind(name=\"InlineParse\")\nDEFAULT_THUMB_URL = \"https://telegra.ph/file/cdfdb65b83a4b7b2b6078.png\"\n\n\nclass InlineStatusReporter(StatusReporter):\n    \"\"\"基于 inline_message_id 的状态报告器\"\"\"\n\n    def __init__(self, cli: Client, inline_message_id: str, caption: str = \"\"):\n        self._cli = cli\n        self._mid = inline_message_id\n        self._caption = caption\n        self._last_text: str | None = None\n\n    async def report(self, text: str) -> None:\n        text = f\"**▎{text}**\"\n        full = f\"{self._caption}\\n{text}\" if self._caption else text\n        if full == self._last_text:\n            return\n        self._last_text = full\n        try:\n            await 
self._cli.edit_inline_text(self._mid, full)\n        except FloodWait:\n            pass\n\n    async def report_error(self, stage: str, error: Exception) -> None:\n        await self._cli.edit_inline_text(\n            self._mid,\n            f\"**▎{stage}错误:** \\n```\\n{error}```\",\n            link_preview_options=LinkPreviewOptions(is_disabled=True),\n        )\n\n        async def fn() -> None:\n            await asyncio.sleep(15)\n            await self._cli.edit_inline_text(\n                self._mid,\n                self._caption,\n                link_preview_options=LinkPreviewOptions(is_disabled=True),\n            )\n\n        loop = asyncio.get_running_loop()\n        loop.create_task(fn())\n\n    async def dismiss(self) -> None:\n        pass\n\n\ndef build_cached_inline_results(entry: CacheEntry, raw_url: str) -> list[InlineQueryResult]:\n    \"\"\"有 file_id 缓存时，构建 cached 类型的 inline 结果（Telegram 服务端直发）\"\"\"\n    if entry.parse_result is None:\n        return []\n    content = entry.parse_result.content\n    caption = build_caption_by_str(entry.parse_result.title, content, raw_url, entry.telegraph_url)\n    title = entry.parse_result.title or \"无标题\"\n\n    # 富文本\n    if entry.telegraph_url:\n        return [\n            InlineQueryResultArticle(\n                title=title,\n                input_message_content=InputTextMessageContent(\n                    caption,\n                    link_preview_options=LinkPreviewOptions(show_above_text=True),\n                ),\n            )\n        ]\n\n    results: list[InlineQueryResult] = []\n    if not entry.media:\n        results.append(\n            InlineQueryResultArticle(\n                title=title,\n                description=content,\n                input_message_content=InputTextMessageContent(\n                    caption,\n                    link_preview_options=LinkPreviewOptions(is_disabled=True),\n                ),\n            )\n        )\n        return results\n\n    for m 
in entry.media:\n        match m.type:\n            case CacheMediaType.PHOTO:\n                results.append(\n                    InlineQueryResultCachedPhoto(\n                        photo_file_id=m.file_id,\n                        title=title,\n                        caption=caption,\n                        description=content,\n                    )\n                )\n            case CacheMediaType.VIDEO:\n                results.append(\n                    InlineQueryResultCachedVideo(\n                        video_file_id=m.file_id,\n                        title=title,\n                        caption=caption,\n                        description=content,\n                    )\n                )\n            case CacheMediaType.ANIMATION:\n                results.append(\n                    InlineQueryResultCachedAnimation(\n                        animation_file_id=m.file_id,\n                        title=title,\n                        caption=caption,\n                    )\n                )\n            case CacheMediaType.DOCUMENT:\n                results.append(\n                    InlineQueryResultCachedDocument(\n                        document_file_id=m.file_id,\n                        title=title,\n                        caption=caption,\n                        description=content,\n                    )\n                )\n\n    return results\n\n\nasync def build_inline_results(parse_result: AnyParseResult, cli: Client) -> list[InlineQueryResult]:\n    \"\"\"根据解析结果构建内联查询结果列表\"\"\"\n    logger.debug(f\"构建 inline 结果: type={parse_result.type}, title={parse_result.title}\")\n    title = parse_result.title or \"无标题\"\n    media_list = to_list(parse_result.media)\n    reply_markup = Ikm([[Ikb(\"原链接\", url=parse_result.raw_url)]])\n\n    results: list[InlineQueryResult] = []\n\n    # ── 富文本直接 telegraph 发送 ──\n    if parse_result.type == PostType.RICHTEXT:\n        url = await create_richtext_telegraph(cli, parse_result)\n        
caption = build_caption(parse_result, url)\n        results.append(\n            InlineQueryResultArticle(\n                title=title,\n                description=parse_result.content,\n                input_message_content=InputTextMessageContent(\n                    caption,\n                    link_preview_options=LinkPreviewOptions(show_above_text=True),\n                ),\n            )\n        )\n        return results\n\n    caption = build_caption(parse_result)\n\n    if not media_list:\n        results.append(\n            InlineQueryResultArticle(\n                title=title,\n                description=parse_result.content,\n                input_message_content=InputTextMessageContent(\n                    caption,\n                    link_preview_options=LinkPreviewOptions(is_disabled=True),\n                ),\n            )\n        )\n        return results\n\n    for index, media_ref in enumerate(media_list):\n        if isinstance(media_ref, ImageRef):\n            results.append(\n                InlineQueryResultPhoto(\n                    media_ref.url,\n                    thumb_url=media_ref.thumb_url,\n                    photo_width=media_ref.width,\n                    photo_height=media_ref.height,\n                    caption=caption,\n                    title=title,\n                    description=parse_result.content,\n                )\n            )\n        elif isinstance(media_ref, VideoRef):\n            results.append(\n                InlineQueryResultPhoto(\n                    media_ref.thumb_url or DEFAULT_THUMB_URL,\n                    photo_width=media_ref.width,\n                    photo_height=media_ref.height,\n                    id=f\"download_{index}\",\n                    title=caption,\n                    caption=caption,\n                    reply_markup=reply_markup,\n                )\n            )\n        elif isinstance(media_ref, AniRef):\n            if media_ref.ext != \"gif\":\n           
     results.append(\n                    InlineQueryResultVideo(\n                        media_ref.url,\n                        media_ref.thumb_url or DEFAULT_THUMB_URL,\n                        caption=caption,\n                        title=title,\n                        description=parse_result.content,\n                    )\n                )\n            else:\n                results.append(\n                    InlineQueryResultAnimation(\n                        media_ref.url,\n                        thumb_url=media_ref.thumb_url,\n                        caption=caption,\n                        title=title,\n                        description=parse_result.content,\n                    )\n                )\n\n    logger.debug(f\"inline 结果构建完成: count={len(results)}\")\n    return results\n\n\n@Client.on_inline_query(~platform_filter)\nasync def inline_parse_tip(_: Client, inline_query: InlineQuery) -> None:\n    results: list[InlineQueryResult] = [\n        InlineQueryResultArticle(\n            title=\"聚合解析\",\n            description=\"请在聊天框输入链接\",\n            input_message_content=InputTextMessageContent(\n                build_start_text(), link_preview_options=LinkPreviewOptions(is_disabled=True)\n            ),\n            thumb_url=\"https://i.imgloc.com/2023/06/15/Vbfazk.png\",\n        )\n    ]\n    await inline_query.answer(results=results, cache_time=1)\n\n\n@Client.on_inline_query(platform_filter)\n@with_request_id\nasync def call_inline_parse(cli: Client, inline_query: InlineQuery) -> None:\n    logger.info(f\"收到内联解析请求: query={inline_query.query}, from_user={inline_query.from_user.id}\")\n    raw_url = await ParseService().get_raw_url(inline_query.query)\n\n    if cached := await persistent_cache.get(raw_url):\n        logger.debug(\"inline: 缓存命中, 构建 cached 结果\")\n        results = build_cached_inline_results(cached, raw_url)\n        await inline_query.answer(results[:50], cache_time=60)\n        return\n\n    parse_result = await 
parse_cache.get(raw_url)\n    if parse_result is None:\n        parse_result = await ParseService().parse(inline_query.query)\n        await parse_cache.set(raw_url, parse_result)\n\n    results = await build_inline_results(parse_result, cli)\n    logger.debug(f\"inline 查询完成, 返回 {len(results)} 个结果\")\n    await inline_query.answer(results[:50], cache_time=0)\n\n\n@Client.on_chosen_inline_result()\n@with_request_id\nasync def inline_result_download(cli: Client, chosen_result: ChosenInlineResult) -> None:\n    if not chosen_result.result_id.startswith(\"download_\"):\n        return\n\n    media_index = int(chosen_result.result_id.split(\"_\")[1])\n    inline_message_id = chosen_result.inline_message_id\n    if inline_message_id is None:\n        return\n    query = chosen_result.query\n    logger.debug(f\"inline 下载触发: media_index={media_index}, query={query}\")\n    raw_url = await ParseService().get_raw_url(query)\n\n    cached_result = await parse_cache.get(raw_url)\n    logger.debug(f\"缓存命中: {cached_result is not None}\")\n\n    caption = build_caption(cached_result) if cached_result else \"\"\n    reporter = InlineStatusReporter(cli, inline_message_id, caption)\n    pipeline = ParsePipeline(query, reporter, parse_result=cached_result, singleflight=False)\n    if (result := await pipeline.run()) is None:\n        return\n\n    parse_result = result.parse_result\n    caption = build_caption(parse_result)\n\n    # ── 上传 ──\n    await reporter.report(\"上 传 中...\")\n\n    processed = result.processed_list[media_index]\n    video_ref = parse_result.media[media_index] if isinstance(parse_result.media, list) else parse_result.media\n\n    try:\n        file_paths = processed.output_paths or [processed.source.path]\n        file_path_str = str(file_paths[0])\n        logger.debug(f\"inline 上传文件: {file_path_str}\")\n        width, height, duration = resolve_media_info(processed, file_path_str)\n\n        video_cover = str(video_ref.thumb_url) if video_ref and 
video_ref.thumb_url else None\n        # Pass video_cover only when a cover exists; all other arguments are shared.\n        cover_kwargs = {\"video_cover\": video_cover} if video_cover else {}\n        media = InputMediaVideo(\n            file_path_str,\n            caption=caption,\n            duration=duration or 0,\n            width=width or 0,\n            height=height or 0,\n            supports_streaming=True,\n            **cover_kwargs,\n        )\n        await cli.edit_inline_media(inline_message_id, media=media)\n    except Exception as e:\n        logger.opt(exception=e).debug(\"详细堆栈\")\n        logger.error(f\"inline 上传失败: {e}\")\n        await reporter.report_error(\"上传\", e)\n    finally:\n        logger.debug(\"inline 下载任务完成\")\n        result.cleanup()\n"
  },
  {
    "path": "plugins/parse.py",
    "content": "import asyncio\nimport os\nfrom collections.abc import Awaitable, Callable\nfrom itertools import batched\nfrom typing import Any, Literal\n\nfrom parsehub.types import (\n    AniFile,\n    AnyMediaRef,\n    AnyParseResult,\n    ImageFile,\n    LivePhotoFile,\n    PostType,\n    VideoFile,\n)\nfrom pyrogram import Client, enums, filters\nfrom pyrogram.errors import FloodWait, SlowmodeWait\nfrom pyrogram.types import (\n    InputMediaAnimation,\n    InputMediaDocument,\n    InputMediaPhoto,\n    InputMediaVideo,\n    LinkPreviewOptions,\n    Message,\n)\n\nfrom core import bs\nfrom log import logger\nfrom plugins.filters import platform_filter\nfrom plugins.helpers import (\n    ProcessedMedia,\n    build_caption,\n    build_caption_by_str,\n    create_richtext_telegraph,\n    resolve_media_info,\n)\nfrom services import ParseService\nfrom services.cache import CacheEntry, CacheMedia, CacheMediaType, CacheParseResult, parse_cache, persistent_cache\nfrom services.pipeline import ParsePipeline, PipelineResult, StatusReporter\nfrom utils.helpers import pack_dir_to_tar_gz, to_list, with_request_id\n\nlogger = logger.bind(name=\"Parse\")\nSKIP_DOWNLOAD_THRESHOLD = 0\nMAX_RETRIES = 5\n\n\nasync def _send_with_rate_limit[T](\n    send_coro_fn: Callable[[], Awaitable[T]],\n) -> T:\n    \"\"\"带自动重试的发送包装器。\n\n    Args:\n        send_coro_fn: 返回协程的可调用对象（lambda 或函数），每次重试会重新调用\n    \"\"\"\n    for attempt in range(MAX_RETRIES):\n        try:\n            return await send_coro_fn()\n        except (FloodWait, SlowmodeWait) as e:\n            if attempt < MAX_RETRIES - 1:\n                logger.warning(f\"{e.ID} 重试 ({attempt + 1}/{MAX_RETRIES})，等待 {e.value}s\")\n                await asyncio.sleep(e.value)\n            else:\n                raise e from e\n    raise RuntimeError(\"发送重试失败\")\n\n\nclass MessageStatusReporter(StatusReporter):\n    \"\"\"基于 Telegram Message 的状态报告器\"\"\"\n\n    def __init__(self, user_msg: Message):\n        self._user_msg = 
user_msg\n        self._msg: Message | None = None\n\n    async def report(self, text: str) -> None:\n        await self._edit_text(f\"**▎{text}**\")\n\n    async def report_error(self, stage: str, error: Exception) -> None:\n        await self._edit_text(\n            f\"**▎{stage}错误:** \\n```\\n{error}```\",\n            link_preview_options=LinkPreviewOptions(is_disabled=True),\n        )\n\n        async def fn() -> None:\n            await asyncio.sleep(15)\n            if self._msg:\n                await self._msg.delete()\n\n        loop = asyncio.get_running_loop()\n        loop.create_task(fn())\n\n    async def dismiss(self) -> None:\n        if self._msg:\n            await self._msg.delete()\n\n    async def _edit_text(self, text: str, **kwargs: Any) -> None:\n        try:\n            if self._msg is None:\n                self._msg = await self._user_msg.reply_text(text, **kwargs)\n            else:\n                if self._msg.text != text:\n                    await self._msg.edit_text(text, **kwargs)\n        except (FloodWait, SlowmodeWait):\n            pass\n\n\n# ── Handler ──────────────────────────────────────────────────────────\n\n\n@Client.on_message(filters.command([\"jx\", \"raw\", \"zip\"]) | ((filters.text | filters.caption) & platform_filter))\nasync def jx(cli: Client, msg: Message) -> None:\n    mode = \"preview\"\n    if msg.command:\n        match msg.command[0]:\n            case \"raw\":\n                mode = \"raw\"\n            case \"jx\":\n                mode = \"preview\"\n            case \"zip\":\n                mode = \"zip\"\n\n        text = \" \".join(msg.command[1:]) if msg.command[1:] else \"\"\n        if not text and msg.reply_to_message:\n            text = msg.reply_to_message.text or msg.reply_to_message.caption or \"\"\n        if not text:\n            await msg.reply_text(\"**▎请加上链接或回复一条消息**\")\n            return\n    else:\n        text = msg.text or msg.caption or \"\"\n\n    tokens = 
text.strip().split()\n    urls = list({i for i in tokens if ParseService().parser.get_platform(i)})[:10]\n\n    if not urls:\n        await msg.reply_text(\"**▎不支持的平台**\")\n        return\n\n    tasks = [handle_parse(cli, msg, url, mode) for url in urls]\n    await asyncio.gather(*tasks)\n\n\n# ── 主流程 ───────────────────────────────────────────────────────────\n\n\n@with_request_id\nasync def handle_parse(\n    cli: Client, msg: Message, url: str, mode: Literal[\"raw\", \"preview\", \"zip\"] | str = \"preview\"\n) -> None:\n    chat_id = msg.chat.id if msg.chat else None\n    logger.info(f\"收到解析请求: url={url}, chat_id={chat_id}, msg_id={msg.id}, mode={mode}\")\n    reporter = MessageStatusReporter(msg)\n    match mode:\n        case \"raw\":\n            use_caching = False\n            skip_media_processing = True\n            singleflight = False\n            save_metadata = False\n        case \"zip\":\n            use_caching = False\n            skip_media_processing = True\n            singleflight = False\n            save_metadata = True\n        case _:\n            use_caching = True\n            skip_media_processing = False\n            singleflight = True\n            save_metadata = False\n    try:\n        raw_url = await ParseService().get_raw_url(url)\n    except Exception as e:\n        await reporter.report_error(\"获取原始链接\", e)\n        return\n\n    if use_caching and (cached := await persistent_cache.get(raw_url)):\n        logger.debug(\"file_id 缓存命中, 直接发送\")\n        await _send_cached(msg, cached, raw_url)\n        return\n\n    cached_parse_result = await parse_cache.get(raw_url)\n    pipeline = ParsePipeline(\n        url,\n        reporter,\n        parse_result=cached_parse_result,\n        singleflight=singleflight,\n        skip_media_processing=skip_media_processing,\n        skip_download_threshold=SKIP_DOWNLOAD_THRESHOLD,\n        save_metadata=save_metadata,\n    )\n\n    if (result := await pipeline.run()) is None:\n        if 
pipeline.waited:\n            logger.debug(\"Singleflight 等待完成, 重新检查缓存\")\n            if cached := await persistent_cache.get(raw_url):\n                await _send_cached(msg, cached, raw_url)\n            else:\n                await handle_parse(cli, msg, url, mode=mode)\n                return\n        else:\n            logger.debug(\"Pipeline 返回 None, 跳过后续处理\")\n        return\n\n    parse_result = result.parse_result\n    await parse_cache.set(raw_url, parse_result)\n\n    # ── 富文本 → Telegraph ──\n    if parse_result.type == PostType.RICHTEXT:\n        logger.debug(f\"富文本类型, 创建 Telegraph 页面: title={parse_result.title}\")\n        try:\n            await msg.reply_chat_action(enums.ChatAction.TYPING)\n            ph_url = await create_richtext_telegraph(cli, parse_result)\n            logger.debug(f\"Telegraph 页面创建完成: {ph_url}\")\n            caption = build_caption(parse_result, ph_url)\n            await msg.reply_text(\n                caption,\n                link_preview_options=LinkPreviewOptions(show_above_text=True),\n            )\n            await persistent_cache.set(\n                raw_url,\n                CacheEntry(\n                    parse_result=CacheParseResult(title=parse_result.title, content=parse_result.content),\n                    telegraph_url=ph_url,\n                ),\n            )\n            await reporter.dismiss()\n            return\n        finally:\n            pipeline.finish()\n\n    caption = build_caption(parse_result)\n    if not result.processed_list:\n        logger.debug(\"无媒体文件, 仅发送文本\")\n        await msg.reply_chat_action(enums.ChatAction.TYPING)\n        await msg.reply_text(\n            caption,\n            link_preview_options=LinkPreviewOptions(is_disabled=True),\n        )\n        cache_entry = CacheEntry(parse_result=CacheParseResult(title=parse_result.title, content=parse_result.content))\n        await persistent_cache.set(raw_url, cache_entry)\n        await reporter.dismiss()\n        
pipeline.finish()\n        return\n\n    if mode == \"raw\":\n        await _send_raw(msg, result, reporter)\n        return\n    if mode == \"zip\":\n        await _send_zip(msg, result, reporter)\n        return\n\n    # ── 上传媒体 ──\n    logger.debug(f\"开始上传媒体: media_count={len(result.processed_list)}\")\n    await reporter.report(\"上 传 中...\")\n    try:\n        media_cache_entry = await _send_media(msg, parse_result, result.processed_list, caption)\n        if media_cache_entry:\n            await persistent_cache.set(raw_url, media_cache_entry)\n        await reporter.dismiss()\n    except Exception as e:\n        logger.opt(exception=e).debug(\"详细堆栈\")\n        logger.error(f\"上传失败: {e}\")\n        await reporter.report_error(\"上传\", e)\n        return\n    finally:\n        result.cleanup()\n        pipeline.finish()\n\n\n# ── 构建 InputMedia ──────────────────────────────────────────────────\n\n\ndef _build_input_media(\n    media_refs: list[AnyMediaRef],\n    processed_list: list[ProcessedMedia],\n) -> tuple[list[InputMediaPhoto | InputMediaVideo], list[InputMediaAnimation]]:\n    \"\"\"根据处理结果和媒体引用构建 Telegram InputMedia 列表。\n\n    Returns:\n        (photos_videos, animations) 两类媒体列表\n    \"\"\"\n    photos_videos: list[InputMediaPhoto | InputMediaVideo] = []\n    animations: list[InputMediaAnimation] = []\n\n    for media_ref, processed in zip(media_refs, processed_list, strict=False):\n        file_paths = processed.output_paths or [processed.source.path]\n        for file_path in file_paths:\n            file_path_str = str(file_path)\n            width, height, duration = resolve_media_info(processed, file_path_str)\n\n            match processed.source:\n                case ImageFile():\n                    photos_videos.append(InputMediaPhoto(media=file_path_str))\n                case AniFile():\n                    animations.append(InputMediaAnimation(media=file_path_str))\n                case VideoFile():\n                    
photos_videos.append(\n                        InputMediaVideo(\n                            media=file_path_str,\n                            video_cover=media_ref.thumb_url,\n                            duration=duration,\n                            width=width,\n                            height=height,\n                            supports_streaming=True,\n                        )\n                    )\n                case LivePhotoFile():\n                    photos_videos.append(\n                        InputMediaVideo(\n                            media=processed.source.video_path,\n                            video_cover=file_path_str,\n                            duration=duration,\n                            width=width,\n                            height=height,\n                            supports_streaming=True,\n                        )\n                    )\n\n    return photos_videos, animations\n\n\n# ── 缓存条目构建 ─────────────────────────────────────────────────────\n\n\ndef _cache_media_from_message(m: Message) -> CacheMedia | None:\n    \"\"\"从已发送的 Telegram Message 提取 CacheMedia。\"\"\"\n    if m.photo:\n        return CacheMedia(type=CacheMediaType.PHOTO, file_id=m.photo.file_id)\n    if m.video:\n        return CacheMedia(\n            type=CacheMediaType.VIDEO,\n            file_id=m.video.file_id,\n            cover_file_id=m.video.video_cover.file_id if m.video.video_cover else None,\n        )\n    if m.animation:\n        return CacheMedia(type=CacheMediaType.ANIMATION, file_id=m.animation.file_id)\n    if m.document:\n        return CacheMedia(type=CacheMediaType.DOCUMENT, file_id=m.document.file_id)\n    return None\n\n\ndef _make_cache_entry(parse_result: AnyParseResult, media_list: list[CacheMedia]) -> CacheEntry:\n    return CacheEntry(\n        parse_result=CacheParseResult(title=parse_result.title, content=parse_result.content),\n        media=media_list,\n    )\n\n\n# ── Raw 模式上传 
──────────────────────────────────────────────────────\n\n\nasync def _send_raw(\n    msg: Message,\n    result: PipelineResult,\n    reporter: MessageStatusReporter,\n) -> None:\n    \"\"\"Raw 模式：将文件以原始文档形式上传。\"\"\"\n    logger.debug(\"Raw 模式, 直接上传文件\")\n    await reporter.report(\"上 传 中...\")\n    try:\n        caption = build_caption(result.parse_result)\n        all_docs: list[InputMediaDocument] = []\n        livephoto_videos: dict[int, InputMediaDocument] = {}\n\n        for idx, processed in enumerate(result.processed_list):\n            file_paths = processed.output_paths or [processed.source.path]\n            file_path = file_paths[0]\n            all_docs.append(InputMediaDocument(media=str(file_path)))\n            if isinstance(processed.source, LivePhotoFile):\n                livephoto_videos[idx] = InputMediaDocument(media=str(processed.source.video_path))\n\n        if len(all_docs) == 1:\n            await msg.reply_chat_action(enums.ChatAction.UPLOAD_DOCUMENT)\n            sent_msg = await _send_with_rate_limit(\n                lambda: msg.reply_document(all_docs[0].media, caption=caption, force_document=True)\n            )\n            if livephoto_videos and sent_msg:\n                await _send_with_rate_limit(\n                    lambda: sent_msg.reply_document(livephoto_videos[0].media, force_document=True)\n                )\n        else:\n            msgs: list[Message] = []\n            for batch in batched(all_docs, 10):\n                await msg.reply_chat_action(enums.ChatAction.UPLOAD_DOCUMENT)\n                # noinspection PyDefaultArgument\n                mg = await _send_with_rate_limit(lambda b=list(batch): msg.reply_media_group(b))  # type: ignore\n                msgs.extend(mg)\n            if livephoto_videos:\n                for idx, media_doc in livephoto_videos.items():\n                    await msg.reply_chat_action(enums.ChatAction.UPLOAD_DOCUMENT)\n                    await _send_with_rate_limit(\n             
           lambda m_=media_doc, idx_=idx: msgs[idx_].reply_document(m_.media, force_document=True)  # type: ignore[misc]\n                    )\n            await _send_with_rate_limit(\n                lambda: msg.reply_text(\n                    caption,\n                    link_preview_options=LinkPreviewOptions(is_disabled=True),\n                )\n            )\n\n    except Exception as e:\n        logger.opt(exception=e).debug(\"详细堆栈\")\n        logger.error(f\"Raw 模式上传失败: {e}\")\n        await reporter.report_error(\"上传\", e)\n        return\n    finally:\n        result.cleanup()\n\n    await reporter.dismiss()\n\n\nasync def _send_zip(\n    msg: Message,\n    result: PipelineResult,\n    reporter: MessageStatusReporter,\n) -> None:\n    logger.debug(\"Zip 模式, 开始打包\")\n    await reporter.report(\"打 包 中...\")\n    try:\n        caption = build_caption(result.parse_result)\n        if result.output_dir is None:\n            raise ValueError(\"缺少打包目录\")\n        pack_path = await asyncio.to_thread(pack_dir_to_tar_gz, result.output_dir)\n    except Exception as e:\n        logger.opt(exception=e).debug(\"详细堆栈\")\n        logger.error(f\"打包失败: {e}\")\n        await reporter.report_error(\"打包\", Exception(\"...\"))\n        return\n    finally:\n        result.cleanup()\n\n    await reporter.report(\"上 传 中...\")\n    try:\n        await msg.reply_chat_action(enums.ChatAction.UPLOAD_DOCUMENT)\n        await _send_with_rate_limit(lambda: msg.reply_document(str(pack_path), caption=caption))\n    except Exception as e:\n        logger.opt(exception=e).debug(\"详细堆栈\")\n        logger.error(f\"上传失败: {e}\")\n        await reporter.report_error(\"上传\", e)\n        return\n    finally:\n        if not bs.debug_skip_cleanup:\n            logger.debug(\"清理压缩包\")\n            os.remove(pack_path)\n\n    await reporter.dismiss()\n\n\n# ── 发送媒体 ─────────────────────────────────────────────────────────\n\n\nasync def _send_single(\n    msg: Message,\n    photos_videos: 
list[InputMediaPhoto | InputMediaVideo],\n    animations: list[InputMediaAnimation],\n    caption: str,\n) -> list[CacheMedia] | None:\n    \"\"\"发送单个媒体，返回 CacheMedia 列表。上传失败时降级为 document。\n    返回 None 表示不缓存\n    \"\"\"\n    media_list: list[CacheMedia] = []\n    all_media = animations + photos_videos\n\n    try:\n        sent: Message | None = None\n        if animations:\n            await msg.reply_chat_action(enums.ChatAction.UPLOAD_PHOTO)\n            sent = await _send_with_rate_limit(lambda: msg.reply_animation(animations[0].media, caption=caption))\n        else:\n            single = photos_videos[0]\n            match single:\n                case InputMediaPhoto():\n                    await msg.reply_chat_action(enums.ChatAction.UPLOAD_PHOTO)\n                    sent = await _send_with_rate_limit(lambda: msg.reply_photo(single.media, caption=caption))\n                case InputMediaVideo():\n                    await msg.reply_chat_action(enums.ChatAction.UPLOAD_VIDEO)\n                    sent = await _send_with_rate_limit(\n                        lambda: msg.reply_video(\n                            single.media,\n                            caption=caption,\n                            video_cover=single.video_cover,\n                            duration=single.duration,\n                            width=single.width,\n                            height=single.height,\n                            supports_streaming=True,\n                        )\n                    )\n\n        if sent and (cm := _cache_media_from_message(sent)):\n            media_list.append(cm)\n    except Exception as e:\n        logger.warning(f\"上传失败 {e}, 使用兼容模式上传\")\n        await msg.reply_chat_action(enums.ChatAction.UPLOAD_DOCUMENT)\n        await _send_with_rate_limit(\n            lambda: msg.reply_document(all_media[0].media, caption=caption, force_document=True)\n        )\n        return None\n\n    return media_list\n\n\nasync def _send_multi(\n    msg: 
Message,\n    photos_videos: list[InputMediaPhoto | InputMediaVideo],\n    animations: list[InputMediaAnimation],\n    caption: str,\n) -> list[CacheMedia] | None:\n    \"\"\"发送多个媒体（动图逐条、图片视频分批），返回 CacheMedia 列表。\n    返回 None 表示不缓存\n    \"\"\"\n    media_list: list[CacheMedia] = []\n    not_cache = False\n\n    for ani in animations:\n        await msg.reply_chat_action(enums.ChatAction.UPLOAD_PHOTO)\n        caption_ = caption if ani == animations[-1] and not photos_videos else \"\"\n        try:\n            sent = await _send_with_rate_limit(\n                lambda a=ani, c=caption_: msg.reply_animation(  # type: ignore[misc]\n                    a.media,\n                    caption=c,\n                )\n            )\n        except Exception as e:\n            logger.warning(f\"上传失败 {e}, 使用兼容模式上传\")\n            not_cache = True\n            await msg.reply_chat_action(enums.ChatAction.UPLOAD_DOCUMENT)\n            await _send_with_rate_limit(\n                lambda a=ani, c=caption_: msg.reply_document(a.media, caption=c, force_document=True)  # type: ignore[misc]\n            )\n        else:\n            # 过大的 GIF 会返回 document\n            if sent and sent.document:\n                media_list.append(CacheMedia(type=CacheMediaType.DOCUMENT, file_id=sent.document.file_id))\n            elif sent and sent.animation:\n                media_list.append(CacheMedia(type=CacheMediaType.ANIMATION, file_id=sent.animation.file_id))\n\n    try:\n        for batch in batched(photos_videos, 10):\n            if batch[-1] == photos_videos[-1]:\n                batch[0].caption = caption\n\n            await msg.reply_chat_action(enums.ChatAction.UPLOAD_PHOTO)\n            # noinspection PyDefaultArgument\n            sent_msgs = await _send_with_rate_limit(lambda b=list(batch): msg.reply_media_group(media=b))  # type: ignore[misc]\n            for m in sent_msgs:\n                if cm := _cache_media_from_message(m):\n                    media_list.append(cm)\n    
except Exception as e:\n        logger.warning(f\"上传失败 {e}, 使用兼容模式上传\")\n        input_documents: list[InputMediaDocument] = [InputMediaDocument(media=item.media) for item in photos_videos]\n        for document_batch in batched(input_documents, 10):\n            if document_batch[-1] == input_documents[-1]:\n                document_batch[0].caption = caption\n\n            await msg.reply_chat_action(enums.ChatAction.UPLOAD_DOCUMENT)\n            # noinspection PyDefaultArgument\n            await _send_with_rate_limit(lambda b=list(document_batch): msg.reply_media_group(media=b))  # type: ignore\n        return None\n\n    return None if not_cache else media_list\n\n\nasync def _send_media(\n    msg: Message, parse_result: AnyParseResult, processed_list: list[ProcessedMedia], caption: str\n) -> CacheEntry | None:\n    \"\"\"构建、发送媒体，并返回缓存条目。\n    返回 None 表示不缓存\n    \"\"\"\n    media_refs: list[AnyMediaRef] = to_list(parse_result.media)\n    photos_videos, animations = _build_input_media(media_refs, processed_list)\n    all_count = len(photos_videos) + len(animations)\n    logger.debug(f\"媒体分类完成: animations={len(animations)}, photos_videos={len(photos_videos)}\")\n\n    if all_count == 1:\n        logger.debug(\"单媒体模式发送\")\n        media_list = await _send_single(msg, photos_videos, animations, caption)\n    else:\n        logger.debug(f\"多媒体模式发送: total={all_count}\")\n        media_list = await _send_multi(msg, photos_videos, animations, caption)\n\n    if media_list is None:\n        return None\n    return _make_cache_entry(parse_result, media_list)\n\n\n# ── 缓存发送 ─────────────────────────────────────────────────────────\n\n\nasync def _send_cached(msg: Message, entry: CacheEntry, url: str) -> None:\n    \"\"\"从 file_id 缓存直接发送，跳过解析/下载/转码\"\"\"\n    logger.debug(f\"缓存发送: media={entry.media}\")\n    if entry.parse_result is None:\n        await persistent_cache.remove(url)\n        return\n    caption = build_caption_by_str(entry.parse_result.title, 
entry.parse_result.content, url, entry.telegraph_url)\n\n    # 富文本类型\n    if entry.telegraph_url:\n        await msg.reply_text(\n            caption,\n            link_preview_options=LinkPreviewOptions(show_above_text=True),\n        )\n        return\n\n    if not entry.media:\n        await msg.reply_text(\n            caption,\n            link_preview_options=LinkPreviewOptions(is_disabled=True),\n        )\n        return\n\n    if len(entry.media) == 1:\n        await _send_cached_single(msg, entry.media[0], caption)\n    else:\n        await _send_cached_multi(msg, entry.media, caption)\n\n\nasync def _send_cached_single(msg: Message, m: CacheMedia, caption: str) -> None:\n    \"\"\"从缓存发送单个媒体。\"\"\"\n    match m.type:\n        case CacheMediaType.PHOTO:\n            await msg.reply_chat_action(enums.ChatAction.UPLOAD_PHOTO)\n            await _send_with_rate_limit(lambda: msg.reply_photo(m.file_id, caption=caption))\n        case CacheMediaType.VIDEO:\n            await msg.reply_chat_action(enums.ChatAction.UPLOAD_VIDEO)\n            await _send_with_rate_limit(\n                lambda: msg.reply_video(\n                    m.file_id, caption=caption, supports_streaming=True, video_cover=m.cover_file_id\n                )\n            )\n        case CacheMediaType.ANIMATION:\n            await msg.reply_chat_action(enums.ChatAction.UPLOAD_PHOTO)\n            await _send_with_rate_limit(lambda: msg.reply_animation(m.file_id, caption=caption))\n        case CacheMediaType.DOCUMENT:\n            await msg.reply_chat_action(enums.ChatAction.UPLOAD_DOCUMENT)\n            await _send_with_rate_limit(lambda: msg.reply_document(m.file_id, caption=caption, force_document=True))\n\n\nasync def _send_cached_multi(msg: Message, media: list[CacheMedia], caption: str) -> None:\n    \"\"\"从缓存发送多个媒体。\"\"\"\n    animations = [m for m in media if m.type == CacheMediaType.ANIMATION]\n    others = [m for m in media if m.type != CacheMediaType.ANIMATION]\n\n    for ani in 
animations:\n        await msg.reply_chat_action(enums.ChatAction.UPLOAD_PHOTO)\n        await _send_with_rate_limit(\n            lambda a=ani: msg.reply_animation(  # type: ignore[misc]\n                a.file_id,\n                caption=caption if a == animations[-1] and not others else \"\",\n            )\n        )\n\n    media_group = _build_cached_media_group(others)\n    for batch in batched(media_group, 10):\n        if batch[-1] == media_group[-1]:\n            batch[0].caption = caption\n\n        await msg.reply_chat_action(enums.ChatAction.UPLOAD_PHOTO)\n        # noinspection PyDefaultArgument\n        await _send_with_rate_limit(lambda m=list(batch): msg.reply_media_group(m))  # type: ignore[misc]\n\n\ndef _build_cached_media_group(\n    media: list[CacheMedia],\n) -> list[InputMediaPhoto | InputMediaVideo | InputMediaDocument]:\n    \"\"\"从 CacheMedia 列表构建 Telegram media group。\"\"\"\n    group: list[InputMediaPhoto | InputMediaVideo | InputMediaDocument] = []\n    for m in media:\n        match m.type:\n            case CacheMediaType.PHOTO:\n                group.append(InputMediaPhoto(media=m.file_id))\n            case CacheMediaType.VIDEO:\n                if m.cover_file_id:\n                    group.append(InputMediaVideo(media=m.file_id, supports_streaming=True, video_cover=m.cover_file_id))\n                else:\n                    group.append(InputMediaVideo(media=m.file_id, supports_streaming=True))\n            case CacheMediaType.DOCUMENT:\n                group.append(InputMediaDocument(media=m.file_id))\n    return group\n"
  },
  {
    "path": "plugins/start.py",
    "content": "from pyrogram import Client, filters\nfrom pyrogram.types import LinkPreviewOptions, Message\n\nfrom plugins.helpers import build_start_text\n\n\n@Client.on_message(filters.command([\"start\", \"help\"]))\nasync def start(_: Client, msg: Message) -> None:\n    await msg.reply(\n        build_start_text(),\n        link_preview_options=LinkPreviewOptions(is_disabled=True),\n    )\n"
  },
  {
    "path": "pyproject.toml",
    "content": "[project]\nname = \"parsehubbot\"\nversion = \"0.1.0\"\ndescription = \"Add your description here\"\nreadme = \"README.md\"\nrequires-python = \">=3.12\"\ndependencies = [\n    \"haishoku>=1.1.8\",\n    \"httpx>=0.28.1\",\n    \"kurigram>=2.2.7\",\n    \"loguru>=0.6.0\",\n    \"lxml-html-clean>=0.4.1\",\n    \"markdown>=3.7\",\n    \"parsehub>=2.0.17\",\n    \"pickledb>=1.6\",\n    \"pillow>=12.1.1\",\n    \"pillow-heif>=1.1.1\",\n    \"pydantic>=2.12.5\",\n    \"pydantic-settings>=2.11.0\",\n    \"python-dotenv>=1.0.1\",\n    \"pyyaml>=6.0.3\",\n    \"telegraph>=2.2.0\",\n    \"tgcrypto>=1.2.5\",\n    \"uvloop>=0.22.1 ; sys_platform != 'win32'\",\n    \"winloop>=0.3.1 ; sys_platform == 'win32'\",\n]\n\n[tool.ruff]\nline-length = 120\n\n[tool.ruff.lint]\nselect = [\n    \"E\", # pycodestyle 错误检查\n    \"W\", # pycodestyle 警告检查\n    \"F\", # pyflakes 错误检查\n    \"I\", # isort 导入排序\n    \"B\", # flake8-bugbear 常见错误检查\n    \"C4\", # flake8-comprehensions 列表/字典推导式检查\n    \"UP\", # pyupgrade 自动升级语法\n]\nignore = [\n    \"B008\", # 不在参数默认值中执行函数调用\n    \"C901\", # 函数复杂度过高\n]\n\n[dependency-groups]\ndev = [\n    \"mypy>=2.1.0\",\n]\n\n[tool.mypy]\npython_version = \"3.12\"\nfiles = [\"./\"]\nignore_missing_imports = true\nwarn_return_any = true\nwarn_unused_ignores = true\ncheck_untyped_defs = true\ndisallow_untyped_defs = true\nno_implicit_optional = true\n"
  },
  {
    "path": "services/__init__.py",
    "content": "from .cache import CacheEntry, CacheMedia, CacheMediaType, CacheParseResult, parse_cache, persistent_cache\nfrom .parser import ParseService\nfrom .pipeline import ParsePipeline, PipelineProgressCallback, PipelineResult, StatusReporter\n\n__all__ = [\n    \"ParseService\",\n    \"parse_cache\",\n    \"persistent_cache\",\n    \"CacheEntry\",\n    \"CacheMedia\",\n    \"CacheMediaType\",\n    \"CacheParseResult\",\n    \"ParsePipeline\",\n    \"PipelineResult\",\n    \"PipelineProgressCallback\",\n    \"StatusReporter\",\n]\n"
  },
  {
    "path": "services/cache.py",
    "content": "import asyncio\nimport time\nfrom enum import StrEnum\nfrom typing import Any\n\nfrom pickledb import PickleDB\nfrom pydantic import BaseModel\n\nfrom core import bs\nfrom log import logger\n\n\nclass TTLCache:\n    def __init__(self, ttl: float = 300, cleanup_interval: float = 60, maxsize: int = 0):\n        self._ttl = ttl\n        self._store: dict[str, tuple[Any, float]] = {}\n        self._lock = asyncio.Lock()\n        self.logger = logger.bind(name=\"TTLCache\")\n        self._cleanup_interval = cleanup_interval\n        self._cleanup_task: asyncio.Task | None = None\n        self._maxsize = maxsize\n\n    async def get(self, key: str) -> Any | None:\n        async with self._lock:\n            entry = self._store.get(key)\n            if entry is None:\n                self.logger.debug(f\"缓存未命中: key={key}\")\n                return None\n            value, expire_at = entry\n            if time.monotonic() > expire_at:\n                self.logger.debug(f\"缓存已过期: key={key}\")\n                del self._store[key]\n                return None\n            self.logger.debug(f\"缓存命中: key={key}\")\n            return value\n\n    async def set(self, key: str, value: Any, ttl: float | None = None) -> None:\n        async with self._lock:\n            effective_ttl = ttl or self._ttl\n            self.logger.debug(f\"缓存写入: key={key}, ttl={effective_ttl}s\")\n            if key in self._store:\n                del self._store[key]\n            self._store[key] = (value, time.monotonic() + effective_ttl)\n            await self._evict_overflow_locked()\n\n    async def _evict_overflow_locked(self) -> None:\n        if self._maxsize <= 0:\n            return\n        overflow = len(self._store) - self._maxsize\n        if overflow <= 0:\n            return\n        for key in list(self._store)[:overflow]:\n            del self._store[key]\n        self.logger.debug(f\"缓存数量超限, 淘汰最旧缓存: {overflow} 条\")\n\n    async def pop(self, key: str) -> Any | 
None:\n        async with self._lock:\n            entry = self._store.pop(key, None)\n            if entry is None:\n                self.logger.debug(f\"缓存 pop 未命中: key={key}\")\n                return None\n            value, expire_at = entry\n            if time.monotonic() > expire_at:\n                self.logger.debug(f\"缓存 pop 已过期: key={key}\")\n                return None\n            self.logger.debug(f\"缓存 pop 命中: key={key}\")\n            return value\n\n    def start_cleanup(self) -> None:\n        \"\"\"启动后台清理任务（需在事件循环运行后调用）\"\"\"\n        if self._cleanup_task is None:\n            self._cleanup_task = asyncio.create_task(self._periodic_cleanup())\n            self.logger.debug(f\"后台清理任务已启动, interval={self._cleanup_interval}s\")\n\n    async def _periodic_cleanup(self) -> None:\n        while True:\n            await asyncio.sleep(self._cleanup_interval)\n            async with self._lock:\n                now = time.monotonic()\n                expired_keys = [k for k, (_, exp) in self._store.items() if now > exp]\n                for k in expired_keys:\n                    del self._store[k]\n                if expired_keys:\n                    self.logger.debug(f\"定时清理过期缓存: {len(expired_keys)} 条\")\n\n\nclass CacheMediaType(StrEnum):\n    PHOTO = \"photo\"\n    VIDEO = \"video\"\n    ANIMATION = \"animation\"\n    DOCUMENT = \"document\"\n\n\nclass CacheParseResult(BaseModel):\n    title: str = \"\"\n    content: str = \"\"\n\n\nclass CacheMedia(BaseModel):\n    type: CacheMediaType\n    file_id: str\n    cover_file_id: str | None = None\n\n\nclass CacheEntry(BaseModel):\n    parse_result: CacheParseResult | None = None\n    media: list[CacheMedia] | None = None\n    telegraph_url: str | None = None\n\n\nclass _StorageWrapper(BaseModel):\n    entry: CacheEntry\n    exp: int = 0\n\n\nclass PersistentCache:\n    def __init__(\n        self,\n        db_path: str,\n        ttl: int,\n        save_interval: float = 5 * 60,\n        cleanup_interval: 
float = 60 * 60,\n        max_entries: int = 30000,\n    ):\n        self._db = PickleDB(db_path)\n        self._ttl = ttl\n        self.logger = logger.bind(name=\"PersistentCache\")\n        self.logger.debug(f\"缓存已初始化: {db_path}\")\n        self._save_interval = save_interval\n        self._cleanup_interval = cleanup_interval\n        self._max_entries = max_entries\n        self._cleanup_task: asyncio.Task | None = None\n        self._lock = asyncio.Lock()\n        self._loaded = False\n        self._dirty = False\n        self._last_cleanup_at = 0.0\n\n    @property\n    def enabled(self) -> bool:\n        return self._ttl > 0\n\n    async def _ensure_loaded_locked(self) -> None:\n        if self._loaded:\n            return\n        await self._db.load()\n        self._loaded = True\n        self._last_cleanup_at = time.monotonic()\n        removed = await self._evict_overflow_locked()\n        if removed:\n            self._dirty = True\n        self.logger.debug(f\"缓存已加载: {self._db.location}, evicted={removed}\")\n\n    async def _save_locked(self) -> None:\n        if not self._loaded or not self._dirty:\n            return\n        await self._db.save()\n        self._dirty = False\n        self.logger.debug(\"缓存已保存\")\n\n    async def get(self, url: str) -> CacheEntry | None:\n        if not self.enabled:\n            return None\n        async with self._lock:\n            await self._ensure_loaded_locked()\n            data = await self._db.get(url)\n            if data is None:\n                return None\n\n            if data.get(\"exp\", 0) <= time.time():\n                self.logger.debug(f\"缓存过期: key={url}\")\n                if await self._db.remove(url):\n                    self._dirty = True\n                return None\n            self.logger.debug(f\"缓存命中: key={url}\")\n            return _StorageWrapper.model_validate(data).entry\n\n    async def set(self, url: str, entry: CacheEntry) -> None:\n        if not self.enabled:\n            
return\n        sw = _StorageWrapper(entry=entry, exp=int(time.time() + self._ttl))\n        async with self._lock:\n            await self._ensure_loaded_locked()\n            await self._db.remove(url)\n            await self._db.set(url, sw.model_dump())\n            removed = await self._evict_overflow_locked()\n            self._dirty = True\n            self.logger.debug(f\"缓存写入: key={url}, evicted={removed}\")\n\n    async def remove(self, url: str) -> None:\n        if not self.enabled:\n            return\n        async with self._lock:\n            await self._ensure_loaded_locked()\n            if await self._db.remove(url):\n                self._dirty = True\n\n    def start_cleanup(self) -> None:\n        \"\"\"启动后台清理任务\"\"\"\n        if not self.enabled:\n            self.logger.debug(\"持久缓存已禁用, 跳过后台任务\")\n            return\n        if self._cleanup_task is None:\n            self._cleanup_task = asyncio.create_task(self._periodic_cleanup())\n            self.logger.debug(\n                f\"后台缓存任务已启动, save_interval={self._save_interval}s, cleanup_interval={self._cleanup_interval}s\"\n            )\n\n    async def close(self) -> None:\n        if self._cleanup_task:\n            self._cleanup_task.cancel()\n            try:\n                await self._cleanup_task\n            except asyncio.CancelledError:\n                pass\n            self._cleanup_task = None\n        if not self.enabled:\n            return\n        async with self._lock:\n            await self._save_locked()\n\n    async def _periodic_cleanup(self) -> None:\n        while True:\n            await asyncio.sleep(self._save_interval)\n            if not self._loaded:\n                continue\n            async with self._lock:\n                now = time.monotonic()\n                if now - self._last_cleanup_at >= self._cleanup_interval:\n                    expired = await self._remove_expired_locked()\n                    overflow = await 
 self._evict_overflow_locked()\n                    if expired or overflow:\n                        self._dirty = True\n                        self.logger.debug(f\"Periodic cache cleanup: expired={expired}, overflow={overflow}\")\n                    self._last_cleanup_at = now\n                await self._save_locked()\n\n    async def _remove_expired_locked(self) -> int:\n        now = time.time()\n        removed = 0\n        all_keys = await self._db.all()\n        for key in all_keys:\n            data = await self._db.get(key)\n            if data and data.get(\"exp\", 0) <= now:\n                await self._db.remove(key)\n                removed += 1\n        return removed\n\n    async def _evict_overflow_locked(self) -> int:\n        if self._max_entries <= 0:\n            return 0\n        keys = await self._db.all()\n        overflow = len(keys) - self._max_entries\n        if overflow <= 0:\n            return 0\n        for key in keys[:overflow]:\n            await self._db.remove(key)\n        return overflow\n\n\nparse_cache = TTLCache(ttl=30 * 60, maxsize=1000)  # parse results are cached for 30 minutes\npersistent_cache = PersistentCache(\n    str(bs.cache_path / \"cache.json\"),\n    ttl=bs.cache_time * 60,\n    save_interval=bs.cache_save_interval * 60,\n    cleanup_interval=bs.cache_cleanup_interval * 60,\n    max_entries=bs.cache_max_entries,\n)\n"
  },
  {
    "path": "services/parser.py",
    "content": "from typing import Self\n\nfrom parsehub import ParseHub, Platform\nfrom parsehub.types import (\n    AnyParseResult,\n)\n\nfrom core import pl_cfg\nfrom log import logger\n\nlogger = logger.bind(name=\"ParseService\")\n\n\nclass ParseService:\n    _instance: Self | None = None\n\n    def __new__(cls) -> Self:\n        if cls._instance is None:\n            cls._instance = super().__new__(cls)\n        return cls._instance\n\n    def __init__(self) -> None:\n        self.parser = ParseHub()\n\n    def get_platform(self, url: str) -> Platform:\n        p = self.parser.get_platform(url)\n        if not p:\n            raise ValueError(\"不支持的平台\")\n        return p\n\n    async def parse(self, url: str) -> AnyParseResult:\n        logger.debug(f\"开始解析 {url}\")\n        p = self.get_platform(url)\n\n        max_retries = 3\n        for attempt in range(1, max_retries + 1):\n            try:\n                cookie = pl_cfg.roll_cookie(p.id)\n                proxy = pl_cfg.roll_parser_proxy(p.id)\n                logger.debug(f\"使用配置: proxy={proxy}, cookie={cookie}, attempt={attempt}/{max_retries}\")\n                pr = await self.parser.parse(url, cookie=cookie, proxy=proxy)\n                logger.debug(f\"解析完成: {pr}\")\n                return pr\n            except Exception as e:\n                logger.warning(f\"解析失败, attempt={attempt}/{max_retries}, err={e}\")\n                if attempt >= max_retries:\n                    raise Exception(e) from e\n        raise\n\n    async def get_raw_url(self, url: str, clean_all: bool = True) -> str:\n        p = self.get_platform(url)\n\n        max_retries = 3\n        for attempt in range(1, max_retries + 1):\n            try:\n                proxy = pl_cfg.roll_parser_proxy(p.id)\n                logger.debug(f\"使用配置: proxy={proxy}, attempt={attempt}/{max_retries}\")\n                raw_url = await self.parser.get_raw_url(url, proxy=proxy, clean_all=clean_all)\n                logger.debug(f\"原始 URL: 
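 {raw_url}\")\n                return str(raw_url)\n            except Exception as e:\n                logger.warning(f\"Failed to get raw URL, attempt={attempt}/{max_retries}, err={e}\")\n                if attempt >= max_retries:\n                    raise Exception(e) from e\n        # Unreachable: the loop above always returns or raises on the last attempt.\n        # A bare `raise` here would fail with \"No active exception to re-raise\".\n        raise RuntimeError(\"unreachable\")\n"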
  },
  {
    "path": "services/pipeline.py",
    "content": "import asyncio\nimport shutil\nfrom collections.abc import Awaitable, Callable\nfrom dataclasses import dataclass, field\nfrom pathlib import Path\nfrom typing import Any, Protocol\n\nfrom parsehub import DownloadResult\nfrom parsehub.types import AnyParseResult, PostType, ProgressUnit\n\nfrom core import bs, pl_cfg\nfrom log import logger\nfrom plugins.helpers import ProcessedMedia, process_media_files\nfrom services import ParseService\nfrom utils.helpers import to_list\n\nlogger = logger.bind(name=\"Pipeline\")\n\n_inflight: dict[str, asyncio.Event] = {}\n\n\nclass StatusReporter(Protocol):\n    \"\"\"抽象状态通知，由调用方实现\"\"\"\n\n    async def report(self, text: str) -> None: ...\n\n    async def report_error(self, stage: str, error: Exception) -> None: ...\n\n    async def dismiss(self) -> None: ...\n\n\n@dataclass\nclass PipelineResult:\n    parse_result: AnyParseResult\n    processed_list: list[ProcessedMedia] = field(default_factory=list)\n    output_dir: Path | None = None\n\n    def cleanup(self) -> None:\n        if bs.debug_skip_cleanup:\n            logger.debug(\"debug_skip_cleanup=True 跳过清理\")\n            return\n        if self.output_dir:\n            logger.debug(\"清理资源\")\n            shutil.rmtree(self.output_dir, ignore_errors=True)\n\n\nclass PipelineProgressCallback:\n    \"\"\"统一的下载进度回调，依赖 StatusReporter\"\"\"\n\n    def __init__(self, reporter: StatusReporter):\n        self._reporter = reporter\n        self._last_text: str | None = None\n\n    async def __call__(self, current: int, total: int, unit: ProgressUnit, *args: Any, **kwargs: Any) -> None:\n        from plugins.helpers import progress as fmt_progress\n\n        text = fmt_progress(current, total, unit)\n        if not text or text == self._last_text:\n            return\n        self._last_text = text\n        await self._reporter.report(text)\n\n\nclass ParsePipeline:\n    \"\"\"\n    将 解析 → 下载 → 格式转换 封装为一条流水线。\n    上传逻辑仍由调用方负责。\n\n    内置 Singleflight 机制：对同一 URL 
的并发调用只会执行一次流水线，\n    其余调用等待 Event 完成后返回 None（调用方应重新检查缓存）。\n    首个调用方在完成上传+缓存后必须调用 finish() 以释放等待者。\n    \"\"\"\n\n    def __init__(\n        self,\n        url: str,\n        reporter: StatusReporter,\n        parse_result: AnyParseResult | None = None,\n        *,\n        singleflight: bool = True,\n        skip_media_processing: bool = False,\n        skip_download_threshold: int = 0,\n        richtext_skip_download: bool = True,\n        save_metadata: bool = False,\n    ):\n        self._url = url\n        self._reporter = reporter\n        self._parse_result = parse_result\n        self._waited = False\n        self._singleflight = singleflight\n        self._skip_media_processing = skip_media_processing\n        self._skip_download_threshold = skip_download_threshold\n        self._richtext_skip_download = richtext_skip_download\n        self._save_metadata = save_metadata\n\n    @property\n    def waited(self) -> bool:\n        \"\"\"是否因 singleflight 而等待了其他流水线\"\"\"\n        return self._waited\n\n    def finish(self) -> None:\n        \"\"\"首个调用方完成上传+缓存后调用，释放所有等待者\"\"\"\n        event = _inflight.pop(self._url, None)\n        if event is not None:\n            event.set()\n\n    async def run(self) -> PipelineResult | None:\n        \"\"\"执行流水线，返回 PipelineResult 或 None（失败时已通知）\"\"\"\n        if self._singleflight:\n            key = self._url\n            existing = _inflight.get(key)\n\n            if existing is not None:\n                self._waited = True\n                logger.debug(f\"Singleflight 命中, 等待已有流水线: url={key}\")\n                await self._reporter.report(\"已有相同任务正在解析, 等待解析完成...\")\n                await existing.wait()\n                await self._reporter.dismiss()\n                return None\n\n            event = asyncio.Event()\n            _inflight[key] = event\n\n        try:\n            result = await self._execute()\n            if result is None:\n                self.finish()  # 流水线失败，立即释放等待者\n            return result\n   
     except BaseException:\n            self.finish()  # 流水线异常，立即释放等待者\n            raise\n\n    async def _execute(self) -> PipelineResult | None:\n        \"\"\"实际执行流水线逻辑\"\"\"\n        logger.debug(f\"流水线启动: url={self._url}, has_cached_result={self._parse_result is not None}\")\n        ps = ParseService()\n        # ── 1. 解析 ──\n        if self._parse_result is not None:\n            logger.debug(\"使用缓存的解析结果\")\n            parse_result = self._parse_result\n        else:\n            await self._reporter.report(\"解 析 中...\")\n            parse_result = await self._step(\"解析\", lambda: ps.parse(self._url))\n            if parse_result is None:\n                return None\n\n        if self._richtext_skip_download and parse_result.type == PostType.RICHTEXT:\n            logger.debug(\"富文本跳过下载\")\n            return PipelineResult(parse_result=parse_result)\n\n        if self._skip_download_threshold and len(to_list(parse_result.media)) > self._skip_download_threshold:\n            logger.debug(\n                f\"媒体数量({len(to_list(parse_result.media))})大于设定值({self._skip_download_threshold}), 跳过下载\"\n            )\n            return PipelineResult(parse_result=parse_result)\n\n        # ── 2. 下载 ──\n        await self._reporter.report(\"下 载 中...\")\n        p = ps.parser.get_platform(self._url)\n        proxy = pl_cfg.roll_downloader_proxy(p.id)\n        logger.debug(f\"使用配置: proxy={proxy}\")\n        progress_cb = PipelineProgressCallback(self._reporter)\n        download_result: DownloadResult = await self._step(\n            \"下载\",\n            lambda: parse_result.download(\n                bs.download_dir, callback=progress_cb, callback_args=(), proxy=proxy, save_metadata=self._save_metadata\n            ),\n            timeout=60 * 30,  # 30分钟\n        )\n        if download_result is None:\n            return None\n        logger.debug(f\"下载完成: output_dir={download_result.output_dir}\")\n\n        # ── 3. 
格式转换 ──\n        if self._skip_media_processing:\n            logger.debug(f\"流水线完成: download_result={download_result}\")\n            processed_list = [ProcessedMedia(i, [i.path]) for i in to_list(download_result.media)]\n            return PipelineResult(\n                parse_result=parse_result, processed_list=processed_list, output_dir=download_result.output_dir\n            )\n\n        await self._reporter.report(\"处 理 中...\")\n        maybe_processed_list = await self._step(\n            \"格式转换\",\n            lambda: process_media_files(download_result),\n            cleanup=lambda: shutil.rmtree(download_result.output_dir, ignore_errors=True),\n        )\n        if maybe_processed_list is None:\n            return None\n        processed_list = maybe_processed_list\n\n        logger.debug(f\"流水线完成: processed_count={len(processed_list)}\")\n        return PipelineResult(\n            parse_result=parse_result,\n            processed_list=processed_list,\n            output_dir=download_result.output_dir,\n        )\n\n    async def _step[T](\n        self,\n        stage: str,\n        action: Callable[[], Awaitable[T]],\n        cleanup: Callable[[], None] | None = None,\n        timeout: float | None = None,\n    ) -> T | None:\n        \"\"\"执行单个步骤，失败时统一处理\"\"\"\n        logger.debug(f\"执行步骤: {stage}\")\n        try:\n            coro = action()\n            if timeout is not None:\n                return await asyncio.wait_for(coro, timeout=timeout)\n            return await coro\n        except TimeoutError:\n            logger.error(f\"{stage}超时 (>{timeout}s)\")\n            await self._reporter.report_error(stage, TimeoutError(f\"{stage}超时 (>{timeout}s)\"))\n            if cleanup:\n                cleanup()\n            return None\n        except Exception as e:\n            logger.exception(e)\n            logger.error(f\"{stage}失败, 以上为错误信息\")\n            await self._reporter.report_error(stage, e)\n            if cleanup:\n                
cleanup()\n            return None\n"
  },
  {
    "path": "utils/__init__.py",
    "content": ""
  },
  {
    "path": "utils/converter.py",
    "content": "# FROM https://github.com/mercuree/html-telegraph-poster/blob/7212225e28a0206803c32e67d1185bbfbd1fc181/html_telegraph_poster/converter.py\nimport re\n\nfrom lxml.html.clean import Cleaner\n\nallowed_tags = (\n    \"a\",\n    \"aside\",\n    \"b\",\n    \"blockquote\",\n    \"br\",\n    \"code\",\n    \"em\",\n    \"figcaption\",\n    \"figure\",\n    \"h3\",\n    \"h4\",\n    \"hr\",\n    \"i\",\n    \"iframe\",\n    \"img\",\n    \"li\",\n    \"ol\",\n    \"p\",\n    \"pre\",\n    \"s\",\n    \"strong\",\n    \"u\",\n    \"ul\",\n    \"video\",\n)\ntelegram_embed_script_re = re.compile(\n    r\"\"\"<script(?=[^>]+\\sdata-telegram-post=['\"]([^'\"]+))[^<]+</script>\"\"\",\n    re.IGNORECASE,\n)\npre_content_re = re.compile(r\"<(pre|code)(>|\\s[^>]*>)[\\s\\S]*?</\\1>\")\nline_breaks_inside_pre = re.compile(r\"<br(/?>|\\s[^<>]*>)\")\nline_breaks_and_empty_strings = re.compile(r\"(\\s{2,}|\\s*\\r?\\n\\s*)\")\nheader_re = re.compile(r\"<head[^a-z][\\s\\S]*</head>\")\n\n\ndef clean_article_html(html_string: str) -> str:\n    html_string = html_string.replace(\"<h1\", \"<h3\").replace(\"</h1>\", \"</h3>\")\n    # telegram will convert <b> anyway\n    html_string = re.sub(r\"<(/?)b(?=\\s|>)\", r\"<\\1strong\", html_string)\n    html_string = re.sub(r\"<(/?)(h2|h5|h6)\", r\"<\\1h4\", html_string)\n    # convert telegram embed posts before cleaner\n    html_string = re.sub(\n        telegram_embed_script_re,\n        r'<iframe src=\"https://t.me/\\1\"></iframe>',\n        html_string,\n    )\n    # remove <head> if present (can't do this with Cleaner)\n    html_string = header_re.sub(\"\", html_string)\n\n    c = Cleaner(\n        allow_tags=allowed_tags,\n        style=True,\n        remove_unknown_tags=False,\n        embedded=False,\n        safe_attrs_only=True,\n        safe_attrs=(\"src\", \"href\", \"class\"),\n    )\n    # wrap with div to be sure it is there\n    # (otherwise lxml will add parent element in some cases\n    html_string = 
 f\"<div>{html_string}</div>\"\n    cleaned = c.clean_html(html_string)\n    # remove wrapped div\n    cleaned = cleaned[5:-6]\n    # remove all line breaks and empty strings\n    html_string = replace_line_breaks_except_pre(cleaned)\n    # but replace multiple br tags with one line break, telegraph will convert it to <br class=\"inline\">\n    html_string = re.sub(r\"(<br(/?>|\\s[^<>]*>)\\s*)+\", \"\\n\", html_string)\n\n    return html_string.strip(\" \\t\")\n\n\ndef replace_line_breaks_except_pre(html_string: str, replace_by: str = \" \") -> str:\n    # Remove all line breaks and empty strings, except pre tag\n    # how to make it in one string? :\\\n    pre_ranges = [0]\n    out = \"\"\n\n    # replace non-breaking space with usual space\n    html_string = html_string.replace(\"\\u00a0\", \" \")\n\n    # get <pre>/<code> start/end positions\n    for x in pre_content_re.finditer(html_string):\n        start, end = x.start(), x.end()\n        pre_ranges.extend((start, end))\n    pre_ranges.append(len(html_string))\n\n    # segments with even k are <pre>/<code> blocks: only turn <br> into newlines there;\n    # segments with odd k are ordinary markup: collapse line breaks and runs of whitespace\n    for k in range(1, len(pre_ranges)):\n        part = html_string[pre_ranges[k - 1] : pre_ranges[k]]\n        if k % 2 == 0:\n            out += line_breaks_inside_pre.sub(\"\\n\", part)\n        else:\n            out += line_breaks_and_empty_strings.sub(replace_by, part)\n    return out\n"
  },
  {
    "path": "utils/event_loop.py",
    "content": "import importlib\nimport sys\n\nfrom log import logger\n\n\ndef setup_optimized_event_loop() -> bool:\n    \"\"\"配置优化的事件循环，自动选择winloop或uvloop\"\"\"\n    is_windows = sys.platform == \"win32\"\n    loop_module = \"winloop\" if is_windows else \"uvloop\"\n\n    try:\n        # 动态导入并安装事件循环\n        module = importlib.import_module(loop_module)\n        module.install()\n        logger.debug(f\"{loop_module} 已启用\")\n        return True\n    except ImportError:\n        logger.debug(f\"{loop_module} 未安装\")\n        logger.debug(\"使用标准 asyncio 事件循环\")\n        return False\n    except Exception as e:\n        logger.debug(f\"启用 {loop_module} 时出错: {e}\")\n        logger.debug(\"使用标准 asyncio 事件循环\")\n        return False\n"
  },
  {
    "path": "utils/helpers.py",
    "content": "import asyncio\nimport functools\nimport tarfile\nimport uuid\nfrom collections.abc import Awaitable, Callable\nfrom pathlib import Path\nfrom typing import Any, overload\n\nfrom log import logger\n\n\nasync def run_cmd(*cmd: str, timeout: float = 30) -> str:\n    \"\"\"运行外部命令并异步读取输出\"\"\"\n    proc = await asyncio.create_subprocess_exec(\n        *cmd,\n        stdout=asyncio.subprocess.PIPE,\n        stderr=asyncio.subprocess.DEVNULL,\n    )\n    try:\n        stdout, _ = await asyncio.wait_for(proc.communicate(), timeout=timeout)\n    except TimeoutError:\n        proc.kill()\n        await proc.wait()\n        return \"\"\n    return stdout.decode().strip()\n\n\n@overload\ndef to_list[T](v: list[T]) -> list[T]: ...\n\n\n@overload\ndef to_list[T](v: T) -> list[T]: ...\n\n\ndef to_list[T](v: T | list[T]) -> list[T]:\n    return v if isinstance(v, list) else [v]\n\n\ndef pack_dir_to_tar_gz(dir_path: str | Path, output_path: str | Path | None = None) -> Path:\n    \"\"\"\n    将目录打包为 tar.gz，返回压缩包路径。\n\n    Args:\n        dir_path: 要打包的目录\n        output_path: 输出压缩包路径；不传则默认生成同名 .tar.gz\n\n    Returns:\n        生成的 tar.gz 文件路径\n    \"\"\"\n    source_dir = Path(dir_path).resolve()\n    if not source_dir.is_dir():\n        raise ValueError(f\"不是有效目录: {source_dir}\")\n\n    if output_path is None:\n        output_path = source_dir.with_suffix(\".tar.gz\")\n    else:\n        output_path = Path(output_path).resolve()\n\n    with tarfile.open(output_path, \"w:gz\") as tar:\n        tar.add(source_dir, arcname=source_dir.name)\n\n    return output_path\n\n\ndef with_request_id[T](func: Callable[..., Awaitable[T]]) -> Callable[..., Awaitable[T]]:\n    @functools.wraps(func)\n    async def wrapper(*args: Any, **kwargs: Any) -> T:\n        request_id = str(uuid.uuid4())[:8]\n        with logger.contextualize(req_id=request_id):\n            return await func(*args, **kwargs)\n\n    return wrapper\n"
  },
  {
    "path": "utils/media_processing_unit.py",
    "content": "\"\"\"Media processor: converts images/videos into Telegram-compatible formats\"\"\"\n\nimport asyncio\nimport math\nimport mimetypes\nimport os\nimport time\nfrom collections.abc import Callable\nfrom dataclasses import dataclass\nfrom pathlib import Path\n\nfrom haishoku.haishoku import Haishoku\nfrom loguru import logger\nfrom PIL import Image, ImageOps\nfrom PIL.Image import Resampling\n\nfrom utils.helpers import run_cmd\n\n\n@dataclass\nclass MediaProcessResult:\n    \"\"\"Unified processing result\"\"\"\n\n    output_paths: list[Path]\n    temp_dir: Path | None = None\n\n\nclass MediaProcessingUnit:\n    \"\"\"Media processor that converts media into Telegram-compatible formats\n\n    Telegram limits:\n    - Image width/height or height/width ratio must not exceed 20:1\n    - At most 10 images per send\n\n    Usage:\n        mpu = MediaProcessingUnit(output_dir=Path(\"./output\"))\n        result = await mpu.process(\"media.mp4\")\n    \"\"\"\n\n    def __init__(\n        self,\n        output_dir: str | Path,\n        segment_height: int = 1400,\n        medium_threshold: int = 2,\n        overlap: int = 100,\n        logger: Callable = logger.info,\n    ):\n        self.output_dir = Path(output_dir)\n        self.output_dir.mkdir(parents=True, exist_ok=True)\n        self.segment_height = segment_height\n        self.medium_threshold = medium_threshold\n        self.overlap = overlap\n        self.logger = logger\n\n    # ------------------------------------------------------------------ #\n    #  Public entry point\n    # ------------------------------------------------------------------ #\n\n    async def process(self, file_path: str | Path) -> MediaProcessResult:\n        media_type = self.get_media_type_by_mime(file_path)\n        self.logger(f\"Processing media: path={file_path}, type={media_type}\")\n        if media_type == \"image\":\n            return await self.process_image(Path(file_path))\n        elif media_type == \"video\":\n            return await self.process_video(Path(file_path))\n        else:\n            raise ValueError(f\"Unsupported media type: {file_path}\")\n\n    # 
------------------------------------------------------------------ #\n    #  Image processing\n    # ------------------------------------------------------------------ #\n\n    async def process_image(self, file_path: Path) -> MediaProcessResult:\n        ext = file_path.suffix.lower()\n        needs_convert = ext in {\".heif\", \".heic\", \".avif\"}\n        intermediates: list[Path] = []  # collect intermediate files in one place\n\n        try:\n            if needs_convert:\n                self.logger(f\"Image format needs conversion: {ext} -> webp\")\n                source = await asyncio.to_thread(self._img2webp, file_path)\n                intermediates.append(source)\n            else:\n                source = file_path\n\n            if result := await asyncio.to_thread(self._adapt_image, source):\n                return result\n\n            # _adapt_image had nothing to do; try downscaling\n            if downscaled := await asyncio.to_thread(self._downscale_image, source):\n                intermediates.append(downscaled)\n                source = downscaled\n\n            intermediates = [p for p in intermediates if p != source]\n            return MediaProcessResult(output_paths=[source])\n        finally:\n            for p in intermediates:\n                if p.exists():\n                    self.logger(f\"Removing intermediate file: {p}\")\n                    os.remove(p)\n\n    def _adapt_image(self, file_path: Path) -> MediaProcessResult | None:\n        \"\"\"Inspect the image dimensions and pad or split as needed; returns None when no processing is required\"\"\"\n        with Image.open(file_path) as img:\n            w, h = img.width, img.height\n\n        wh_ratio = w / h\n        hw_ratio = h / w\n        self.logger(f\"Image size: {w}x{h}, wh_ratio={wh_ratio:.2f}, hw_ratio={hw_ratio:.2f}\")\n\n        if w >= h:\n            # Landscape image\n            if wh_ratio <= 20:\n                self.logger(\"Landscape ratio is within limits, skipping\")\n                return None\n            self.logger(\"Landscape ratio exceeds the limit, padding required\")\n            padding = self._calc_padding_horizontal(w, h)\n            with Image.open(file_path) as img:\n                return 
self._pad_image(file_path, img, padding)\n        else:\n            # Portrait image\n            # A ratio of exactly 20:1 is still within the limit, so compare with <=\n            if hw_ratio <= 5 or (w < 200 and hw_ratio <= 20):\n                self.logger(\"Portrait ratio is within limits, skipping\")\n                return None\n            if w < 200 and hw_ratio > 20:\n                self.logger(\"Narrow portrait ratio exceeds the limit, padding required\")\n                padding = self._calc_padding_vertical(w, h)\n                with Image.open(file_path) as img:\n                    return self._pad_image(file_path, img, padding)\n            # Split long image\n            segments = h // self.segment_height\n            seg_h = h // 2 if segments < self.medium_threshold else self.segment_height\n            self.logger(f\"Splitting long image: segments={segments}, seg_h={seg_h}\")\n            return self._split_image(file_path, seg_h)\n\n    def _img2webp(self, file_path: Path) -> Path:\n        with Image.open(file_path) as pil_img:\n            img = pil_img.convert(\"RGBA\") if pil_img.mode != \"RGBA\" else pil_img\n            output = self.output_dir / file_path.with_suffix(\".webp\").name\n            img.save(output, format=\"WEBP\")\n        self.logger(f\"webp conversion finished: {output}\")\n        return output\n\n    def _downscale_image(self, file_path: Path, max_side: int = 2560) -> Path | None:\n        \"\"\"If either side exceeds max_side, scale proportionally so the longer side equals max_side and return the new file path; returns None when no scaling is needed\"\"\"\n        with Image.open(file_path) as img:\n            w, h = img.size\n            if max(w, h) <= max_side:\n                return None\n            scale = max_side / max(w, h)\n            new_w, new_h = int(w * scale), int(h * scale)\n            self.logger(f\"Image long side exceeds the limit ({max(w, h)}px > {max_side}px), scaling: {w}x{h} -> {new_w}x{new_h}\")\n            resized = img.resize((new_w, new_h), Resampling.LANCZOS)\n            ext = (img.format and f\".{img.format.lower()}\") or file_path.suffix\n            out_path = self.output_dir / f\"downscaled_{time.time_ns()}{ext}\"\n            resized.save(out_path)\n        return out_path\n\n    # -- Image helpers 
--------------------------------------------------------- #\n\n    @staticmethod\n    def _calc_padding_horizontal(w: int, h: int) -> tuple[int, int, int, int]:\n        h_padding = w // 20 - h // 2\n        return 0, h_padding, 0, h_padding\n\n    @staticmethod\n    def _calc_padding_vertical(w: int, h: int) -> tuple[int, int, int, int]:\n        w_padding = h // 20 - w // 2\n        return w_padding, 0, w_padding, 0\n\n    @staticmethod\n    def _get_dominant_color(file_path: Path) -> tuple[int, ...]:\n        haishoku = Haishoku.loadHaishoku(str(file_path))\n        return tuple(int(v * 0.8) for v in haishoku.palette[0][1])\n\n    def _pad_image(\n        self,\n        file_path: Path,\n        img: Image.Image,\n        padding: tuple[int, int, int, int],\n    ) -> MediaProcessResult:\n        fill_color = self._get_dominant_color(file_path)\n        padded = ImageOps.expand(img, padding, fill=fill_color)\n        out_path = self.output_dir / f\"padded_{time.time_ns()}.png\"\n        padded.save(out_path)\n        self.logger(f\"Padding finished: padding={padding}, color={fill_color}, output={out_path}\")\n        return MediaProcessResult(output_paths=[out_path])\n\n    def _split_image(self, file_path: Path, segment_height: int) -> MediaProcessResult:\n        temp_dir = self.output_dir / f\"split_{time.time_ns()}\"\n        temp_dir.mkdir(parents=True, exist_ok=True)\n        segments = self._do_split(file_path, temp_dir, segment_height)\n        self.logger(f\"Image split finished: {len(segments)} segments, output_dir={temp_dir}\")\n        return MediaProcessResult(output_paths=segments, temp_dir=temp_dir)\n\n    def _do_split(\n        self,\n        input_path: Path,\n        output_dir: Path,\n        segment_height: int,\n    ) -> list[Path]:\n        with Image.open(input_path) as img:\n            width, height = img.size\n            num_segments = math.ceil(height / segment_height)\n            self.logger(f\"Split parameters: size={width}x{height}, segment_h={segment_height}, 
num={num_segments}\")\n            result: list[Path] = []\n            for i in range(num_segments):\n                top = i * segment_height - (self.overlap if i != 0 else 0)\n                bottom = min((i + 1) * segment_height, height)\n                segment = img.crop((0, top, width, bottom))\n                out_path = output_dir / f\"segment_{i + 1:03d}.png\"\n                segment.save(out_path)\n                result.append(out_path)\n        return result\n\n    # ------------------------------------------------------------------ #\n    #  Video processing\n    # ------------------------------------------------------------------ #\n\n    async def process_video(self, file_path: Path) -> MediaProcessResult:\n        codec = await self.get_video_codec(file_path)\n        self.logger(f\"Video codec: codec={codec}, path={file_path}\")\n\n        converted: Path | None = None\n        if codec != \"h264\":\n            self.logger(\"Codec is not h264, transcoding\")\n            converted = await self.ensure_h264(file_path)\n            self.logger(f\"Transcoding finished: {converted}\")\n\n        source = converted or file_path\n        video_size = source.stat().st_size\n        self.logger(f\"Video size: {video_size / 1024 / 1024:.1f} MB\")\n\n        if video_size > 2 * 1024**3:  # 2 GiB\n            self.logger(\"Video exceeds 2 GiB, splitting\")\n            output_paths, output_dir = await self.split_video(source, self.output_dir)\n            # ensure_h264 returns the original path on failure; never delete the input file\n            if converted and converted != file_path:\n                os.remove(converted)\n            return MediaProcessResult(output_paths=output_paths, temp_dir=output_dir)\n\n        return MediaProcessResult(output_paths=[source])\n\n    @staticmethod\n    async def get_video_codec(file_path: Path) -> str:\n        out = await run_cmd(\n            \"ffprobe\",\n            \"-v\",\n            \"error\",\n            \"-select_streams\",\n            \"v:0\",\n            \"-show_entries\",\n            \"stream=codec_name\",\n            \"-of\",\n            \"default=noprint_wrappers=1:nokey=1\",\n        
    str(file_path),\n        )\n        return out.strip().lower() if out else \"\"\n\n    @staticmethod\n    async def get_duration(file_path: Path) -> float:\n        out = await run_cmd(\n            \"ffprobe\",\n            \"-v\",\n            \"error\",\n            \"-show_entries\",\n            \"format=duration\",\n            \"-of\",\n            \"default=noprint_wrappers=1:nokey=1\",\n            str(file_path),\n        )\n        return float(out.strip()) if out else 0.0\n\n    async def ensure_h264(self, file_path: Path) -> Path:\n        out = self.output_dir / (file_path.stem + \"_h264\" + file_path.suffix)\n        duration = await self.get_duration(file_path)\n        height = await self._get_video_height(file_path)\n\n        cmd = self._build_sw_transcode_cmd(file_path, out, duration, height)\n\n        self.logger(f\"h264 transcode: {file_path.name} -> {out.name}, duration={duration:.0f}s, encoder=SW:libx264\")\n\n        proc = await asyncio.create_subprocess_exec(\n            *cmd,\n            stdout=asyncio.subprocess.DEVNULL,\n            stderr=asyncio.subprocess.DEVNULL,\n        )\n        await proc.wait()\n\n        if out.exists() and out.stat().st_size > 0:\n            self.logger(f\"h264 transcode succeeded: size={out.stat().st_size / 1024 / 1024:.1f}MB\")\n            return out\n\n        self.logger(f\"h264 transcode failed, returning the original file: {file_path}\")\n        return file_path\n\n    @staticmethod\n    async def _get_video_height(file_path: Path) -> int:\n        out = await run_cmd(\n            \"ffprobe\",\n            \"-v\",\n            \"error\",\n            \"-select_streams\",\n            \"v:0\",\n            \"-show_entries\",\n            \"stream=height\",\n            \"-of\",\n            \"default=noprint_wrappers=1:nokey=1\",\n            str(file_path),\n        )\n        return int(out.strip()) if out and out.strip().isdigit() else 0\n\n    def _build_sw_transcode_cmd(self, file_path: Path, out: Path, duration: float, height: int) -> 
list[str]:\n        if duration <= 30:\n            preset, crf = \"slow\", \"18\"\n        elif duration <= 60:\n            preset, crf = \"medium\", \"20\"\n        elif duration <= 600:\n            preset, crf = \"fast\", \"23\"\n        elif duration <= 1800:\n            preset, crf = \"veryfast\", \"26\"\n        else:\n            preset, crf = \"ultrafast\", \"28\"\n\n        scale = [\"-vf\", \"scale=-2:720\"] if duration > 1800 and height > 720 else []\n        self.logger(f\"SW transcode strategy: preset={preset}, crf={crf}, scale={'720p' if scale else 'original'}\")\n\n        return [\n            \"ffmpeg\",\n            \"-i\",\n            str(file_path),\n            \"-c:v\",\n            \"libx264\",\n            \"-preset\",\n            preset,\n            \"-crf\",\n            crf,\n            *scale,\n            \"-c:a\",\n            \"aac\",\n            \"-y\",\n            str(out),\n        ]\n\n    async def split_video(\n        self,\n        file_path: Path,\n        output_dir: Path,\n        size_limit: int = 2_000_000_000,\n        ffmpeg_args: list[str] | None = None,\n        keep_sec: float = 1.0,\n    ) -> tuple[list[Path], Path]:\n        if ffmpeg_args is None:\n            ffmpeg_args = [\"-c\", \"copy\"]\n\n        base = file_path.stem\n        split_dir = output_dir / f\"{base}_split\"\n        split_dir.mkdir(parents=True, exist_ok=True)\n        ext = file_path.suffix.lstrip(\".\")\n        total_duration = int(await self.get_duration(file_path))\n        self.logger(f\"Splitting video: duration={total_duration}s, size_limit={size_limit}\")\n\n        cur, part, output_paths = 0, 1, []\n        while cur < total_duration:\n            out_file = split_dir / f\"{base}_part_{part:03d}.{ext}\"\n            output_paths.append(out_file)\n            cmd = [\n                \"ffmpeg\",\n                \"-ss\",\n                str(cur),\n                \"-i\",\n                str(file_path),\n                \"-fs\",\n                
str(size_limit),\n                *ffmpeg_args,\n                \"-y\",\n                str(out_file),\n            ]\n            proc = await asyncio.create_subprocess_exec(\n                *cmd,\n                stdout=asyncio.subprocess.DEVNULL,\n                stderr=asyncio.subprocess.DEVNULL,\n            )\n            await proc.wait()\n\n            new_dur = int(await self.get_duration(out_file))\n            self.logger(f\"Split part {part}: offset={cur}s, duration={new_dur}s, file={out_file}\")\n            if new_dur <= 0:\n                break\n            cur += new_dur\n            if cur < total_duration:\n                cur = max(cur - int(keep_sec), 0)\n            part += 1\n\n        self.logger(f\"Video split finished: {len(output_paths)} segments\")\n        return output_paths, split_dir\n\n    # ------------------------------------------------------------------ #\n    #  Utilities\n    # ------------------------------------------------------------------ #\n\n    @staticmethod\n    def get_media_type_by_mime(file_path: str | Path) -> str:\n        mime, _ = mimetypes.guess_type(str(file_path))\n        if mime:\n            if mime.startswith(\"image/\"):\n                return \"image\"\n            if mime.startswith(\"video/\"):\n                return \"video\"\n        return \"unknown\"\n\n\nasync def main() -> None:\n    mpu = MediaProcessingUnit(output_dir=Path(r\"D:\\Downloads\\新建文件夹\"))\n    result = await mpu.process(r\"D:\\Downloads\\36751083810-1-30066.mp4\")\n    print(result.output_paths)\n\n\nif __name__ == \"__main__\":\n    asyncio.run(main())\n"
  },
  {
    "path": "utils/ph.py",
    "content": "import random\nfrom dataclasses import dataclass\nfrom typing import Any\n\nfrom telegraph.aio import Telegraph as TelegraphAPI\n\n\nclass Telegraph:\n    \"\"\"Wrapper around the Telegraph API\"\"\"\n\n    def __init__(self, token: str | None = None, domain: str = \"telegra.ph\"):\n        self.token = token\n        self.domain = domain\n        self.telegraph = TelegraphAPI(access_token=token, domain=domain)\n\n    async def create_account(\n        self, short_name: str, author_name: str | None = None, author_url: str | None = None\n    ) -> \"TelegraphAccount\":\n        \"\"\"Create a Telegraph account\"\"\"\n        account = await self.telegraph.create_account(short_name, author_name, author_url)\n        acc_info = await self.get_account_info(account)\n        self.token = acc_info.access_token\n        return acc_info\n\n    async def get_account_info(self, account_info: dict[str, str] | None = None) -> \"TelegraphAccount\":\n        \"\"\"Fetch Telegraph account information\"\"\"\n        account_info = account_info or await self.telegraph.get_account_info(\n            [\n                \"short_name\",\n                \"author_name\",\n                \"author_url\",\n                \"auth_url\",\n            ]\n        )\n        return TelegraphAccount(\n            self.telegraph.get_access_token(),\n            account_info[\"short_name\"],\n            account_info[\"author_name\"],\n            account_info[\"author_url\"],\n            account_info[\"auth_url\"],\n        )\n\n    async def create_page(\n        self,\n        title: str,\n        content: list[dict[str, Any]] | None = None,\n        html_content: str | None = None,\n        author_name: str | None = None,\n        author_url: str | None = None,\n        return_content: bool = False,\n        auto_create_account: bool = True,\n    ) -> \"TelegraphPage\":\n        \"\"\"Create a Telegraph page\"\"\"\n        if auto_create_account and not self.token:\n            # Random short name for the auto-created account\n            short_name = \"tg_\" + 
str(random.randint(100000, 999999))\n            await self.create_account(short_name)\n        response = await self.telegraph.create_page(\n            title,\n            content,\n            html_content,\n            author_name,\n            author_url,\n            return_content,\n        )\n        return TelegraphPage(\n            response[\"path\"],\n            response[\"url\"],\n            response[\"title\"],\n            response[\"description\"],\n            response[\"views\"],\n            response[\"can_edit\"],\n            await self.get_account_info(),\n        )\n\n\n@dataclass\nclass TelegraphAccount:\n    access_token: str\n    short_name: str\n    author_name: str\n    author_url: str\n    auth_url: str\n\n\n@dataclass\nclass TelegraphPage:\n    path: str\n    url: str\n    title: str\n    description: str\n    views: int\n    can_edit: bool\n    account: TelegraphAccount\n"
  }
]