[
  {
    "path": ".gitignore",
    "content": "# Byte-compiled / optimized / DLL files\r\n__pycache__/\r\n*.py[cod]\r\n*$py.class\r\n\r\n# C extensions\r\n*.so\r\n\r\n# Distribution / packaging\r\n.Python\r\nbuild/\r\ndevelop-eggs/\r\ndist/\r\ndownloads/\r\neggs/\r\n.eggs/\r\nlib/\r\nlib64/\r\nparts/\r\nsdist/\r\nvar/\r\nwheels/\r\npip-wheel-metadata/\r\nshare/python-wheels/\r\n*.egg-info/\r\n.installed.cfg\r\n*.egg\r\nMANIFEST\r\n\r\n# PyInstaller\r\n#  Usually these files are written by a python script from a template\r\n#  before PyInstaller builds the exe, so as to inject date/other infos into it.\r\n*.manifest\r\n*.spec\r\n\r\n# Installer logs\r\npip-log.txt\r\npip-delete-this-directory.txt\r\n\r\n# Unit test / coverage reports\r\nhtmlcov/\r\n.tox/\r\n.nox/\r\n.coverage\r\n.coverage.*\r\n.cache\r\nnosetests.xml\r\ncoverage.xml\r\n*.cover\r\n*.py,cover\r\n.hypothesis/\r\n.pytest_cache/\r\n\r\n# Translations\r\n*.mo\r\n*.pot\r\n\r\n# Django stuff:\r\n*.log\r\nlocal_settings.py\r\ndb.sqlite3\r\ndb.sqlite3-journal\r\n\r\n# Flask stuff:\r\ninstance/\r\n.webassets-cache\r\n\r\n# Scrapy stuff:\r\n.scrapy\r\n\r\n# Sphinx documentation\r\ndocs/_build/\r\n\r\n# PyBuilder\r\ntarget/\r\n\r\n# Jupyter Notebook\r\n.ipynb_checkpoints\r\n\r\n# IPython\r\nprofile_default/\r\nipython_config.py\r\n\r\n# pyenv\r\n.python-version\r\n\r\n# pipenv\r\n#   According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.\r\n#   However, in case of collaboration, if having platform-specific dependencies or dependencies\r\n#   having no cross-platform support, pipenv may install dependencies that don't work, or not\r\n#   install all needed dependencies.\r\n#Pipfile.lock\r\n\r\n# PEP 582; used by e.g. 
github.com/David-OConnor/pyflow\r\n__pypackages__/\r\n\r\n# Celery stuff\r\ncelerybeat-schedule\r\ncelerybeat.pid\r\n\r\n# SageMath parsed files\r\n*.sage.py\r\n\r\n# Environments\r\n.env\r\n.venv\r\nenv/\r\nvenv/\r\nENV/\r\nenv.bak/\r\nvenv.bak/\r\n\r\n# Spyder project settings\r\n.spyderproject\r\n.spyproject\r\n\r\n# Rope project settings\r\n.ropeproject\r\n\r\n# mkdocs documentation\r\n/site\r\n\r\n# mypy\r\n.mypy_cache/\r\n.dmypy.json\r\ndmypy.json\r\n\r\n# Pyre type checker\r\n.pyre/\r\n"
  },
  {
    "path": "LICENSE",
    "content": "MIT License\r\n\r\nCopyright (c) 2022 社会易姐QwQ\r\n\r\nPermission is hereby granted, free of charge, to any person obtaining a copy\r\nof this software and associated documentation files (the \"Software\"), to deal\r\nin the Software without restriction, including without limitation the rights\r\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\r\ncopies of the Software, and to permit persons to whom the Software is\r\nfurnished to do so, subject to the following conditions:\r\n\r\nThe above copyright notice and this permission notice shall be included in all\r\ncopies or substantial portions of the Software.\r\n\r\nTHE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\r\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\r\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE\r\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\r\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\r\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\r\nSOFTWARE.\r\n"
  },
  {
    "path": "README.md",
    "content": "<h1 align=\"center\">Bcut-ASR</h1>\r\n\r\n使用必剪 API 进行云端语音字幕识别，支持 CLI 和 module 调用\r\n\r\n## ✨Feature\r\n\r\n- 可直接上传`flac`, `aac`, `m4a`, `mp3`, `wav`音频格式\r\n- 自动调用 ffmpeg, 实现视频伴音和其他音频格式转码\r\n- 支持`srt`, `json`, `lrc`, `txt`格式字幕输出\r\n- 字幕支持断句和首位时间标记\r\n- 可使用 stdout 输出字幕文本\r\n\r\n## 🚀Install\r\n\r\n首先确保 ffmpeg 已安装，且 PATH 中可以访问，若未安装可以使用如下命令（已安装请无视）：\r\n\r\nLinux：\r\n\r\n```bash\r\nsudo apt install ffmpeg\r\n```\r\n\r\nWindows：\r\n\r\n```powershell\r\nwinget install ffmpeg\r\n```\r\n\r\n本项目暂时未发布 pypi，应使用本地安装，Python 版本应 >= 3.10，需要安装 poetry \r\n\r\n```bash\r\ngit clone https://github.com/SocialSisterYi/bcut-asr\r\ncd bcut-asr\r\npoetry lock\r\npoetry build -f wheel\r\npip install dist/bcut_asr-0.0.3-py3-none-any.whl # Example\r\n```\r\n\r\n## 📃Usage\r\n\r\n### CLI Interface\r\n\r\n```bash\r\nbcut_asr video.mp4\r\n```\r\n\r\n或\r\n\r\n```bash\r\nbcut_asr video.mp4 subtitle.srt\r\n```\r\n\r\n或\r\n\r\n```bash\r\nbcut_asr video.mp4 -f srt - > subtitle.srt\r\n```\r\n\r\n长音频指定任务状态轮询间隔(秒)，避免接口频繁调用\r\n\r\n```bash\r\nbcut_asr video.mp4 -f srt -i 30 - > subtitle.srt\r\n```\r\n\r\n```\r\nbcut_asr -h\r\nusage: bcut-asr [-h] [-f [{srt,json,lrc,txt}]] [-i [1.0]] input [output]\r\n\r\n必剪语音识别\r\n\r\npositional arguments:\r\n  input                 输入媒体文件\r\n  output                输出字幕文件, 可stdout\r\n\r\noptions:\r\n  -h, --help            show this help message and exit\r\n  -f [{srt,json,lrc,txt}], --format [{srt,json,lrc,txt}]\r\n                        输出字幕格式\r\n  -i [1.0], --interval [1.0]\r\n                        任务状态轮询间隔(秒)\r\n\r\n支持输入音频格式: flac, aac, m4a, mp3, wav 支持自动调用ffmpeg提取视频伴音\r\n```\r\n\r\n### Module\r\n\r\n```python\r\nfrom bcut_asr import BcutASR\r\nfrom bcut_asr.orm import ResultStateEnum\r\n\r\nasr = BcutASR('voice.mp3')\r\nasr.upload() # 上传文件\r\nasr.create_task() # 创建任务\r\n\r\n# 轮询检查结果\r\nwhile True:\r\n    result = asr.result()\r\n    # 判断识别成功\r\n    if result.state == ResultStateEnum.COMPLETE:\r\n        break\r\n\r\n# 解析字幕内容\r\nsubtitle = 
result.parse()\r\n# 判断是否存在字幕\r\nif subtitle.has_data():\r\n    # 输出srt格式\r\n    print(subtitle.to_srt())\r\n```\r\n\r\n输入视频\r\n\r\n```python\r\nfrom bcut_asr import run_everywhere\r\nfrom argparse import Namespace\r\n\r\n\r\nf = open(\"file.mp4\", \"rb\")\r\nargg = Namespace(format=\"srt\", interval=30.0, input=f, output=None)\r\nrun_everywhere(argg)\r\n\r\n```\r\n"
  },
  {
    "path": "bcut_asr/__init__.py",
    "content": "import logging\r\nimport sys\r\nimport time\r\nfrom os import PathLike\r\nfrom pathlib import Path\r\nfrom typing import Literal, Optional\r\n\r\nimport ffmpeg\r\nimport requests\r\n\r\nfrom .orm import (\r\n    ResourceCompleteRspSchema,\r\n    ResourceCreateRspSchema,\r\n    ResultRspSchema,\r\n    ResultStateEnum,\r\n    TaskCreateRspSchema,\r\n)\r\n\r\n__version__ = \"0.0.3\"\r\n\r\nAPI_BASE_URL = \"https://member.bilibili.com/x/bcut/rubick-interface\"\r\n\r\n# 申请上传\r\nAPI_REQ_UPLOAD = API_BASE_URL + \"/resource/create\"\r\n\r\n# 提交上传\r\nAPI_COMMIT_UPLOAD = API_BASE_URL + \"/resource/create/complete\"\r\n\r\n# 创建任务\r\nAPI_CREATE_TASK = API_BASE_URL + \"/task\"\r\n\r\n# 查询结果\r\nAPI_QUERY_RESULT = API_BASE_URL + \"/task/result\"\r\n\r\nSUPPORT_SOUND_FORMAT = Literal[\"flac\", \"aac\", \"m4a\", \"mp3\", \"wav\"]\r\n\r\nINFILE_FMT = [\"flac\", \"aac\", \"m4a\", \"mp3\", \"wav\"]\r\nOUTFILE_FMT = [\"srt\", \"json\", \"lrc\", \"txt\"]\r\n\r\n\r\ndef ffmpeg_render(media_file: str) -> bytes:\r\n    \"提取视频伴音并转码为aac格式\"\r\n    out, err = (\r\n        ffmpeg.input(media_file, v=\"warning\")\r\n        .output(\"pipe:\", ac=1, format=\"adts\")\r\n        .run(capture_stdout=True)\r\n    )\r\n    return out\r\n\r\n\r\ndef run_everywhere(argg):\r\n    logging.basicConfig(\r\n        format=\"%(asctime)s - [%(levelname)s] %(message)s\", level=logging.INFO\r\n    )\r\n    # 处理输入文件情况\r\n    infile = argg.input\r\n    infile_name = infile.name\r\n    if infile_name == \"<stdin>\":\r\n        logging.error(\"输入文件错误\")\r\n        sys.exit(-1)\r\n    suffix = infile_name.rsplit(\".\", 1)[-1]\r\n    if suffix in INFILE_FMT:\r\n        infile_fmt = suffix\r\n        infile_data = infile.read()\r\n    else:\r\n        # ffmpeg分离视频伴音\r\n        logging.info(\"非标准音频文件, 尝试调用ffmpeg转码\")\r\n        try:\r\n            infile_data = ffmpeg_render(infile_name)\r\n        except ffmpeg.Error:\r\n            logging.error(\"ffmpeg转码失败\")\r\n            sys.exit(-1)\r\n        
else:\r\n            logging.info(\"ffmpeg转码完成\")\r\n            infile_fmt = \"aac\"\r\n\r\n    # 处理输出文件情况\r\n    outfile = argg.output\r\n    if outfile is None:\r\n        # 未指定输出文件，默认为文件名同输入，可以 -t 传参，默认str格式\r\n        if argg.format is not None:\r\n            outfile_fmt = argg.format\r\n        else:\r\n            outfile_fmt = \"srt\"\r\n    else:\r\n        # 指定输出文件\r\n        outfile_name = outfile.name\r\n        if outfile.name == \"<stdout>\":\r\n            # stdout情况，可以 -t 传参，默认str格式\r\n            if argg.format is not None:\r\n                outfile_fmt = argg.format\r\n            else:\r\n                outfile_fmt = \"srt\"\r\n        else:\r\n            suffix = outfile_name.rsplit(\".\", 1)[-1]\r\n            if suffix in OUTFILE_FMT:\r\n                outfile_fmt = suffix\r\n            else:\r\n                logging.error(\"输出格式错误\")\r\n                sys.exit(-1)\r\n\r\n    interval = argg.interval\r\n    if interval is None:\r\n        interval = 30.0\r\n    # 开始执行转换逻辑\r\n    asr = BcutASR()\r\n    asr.set_data(raw_data=infile_data, data_fmt=infile_fmt)\r\n    try:\r\n        # 上传文件\r\n        asr.upload()\r\n        # 创建任务\r\n        task_id = asr.create_task()\r\n        while True:\r\n            # 轮询检查任务状态\r\n            task_resp = asr.result()\r\n            match task_resp.state:\r\n                case ResultStateEnum.STOP:\r\n                    logging.info(f\"等待识别开始\")\r\n                case ResultStateEnum.RUNING:\r\n                    logging.info(f\"识别中-{task_resp.remark}\")\r\n                case ResultStateEnum.ERROR:\r\n                    logging.error(f\"识别失败-{task_resp.remark}\")\r\n                    sys.exit(-1)\r\n                case ResultStateEnum.COMPLETE:\r\n                    outfile_name = f\"{infile_name.rsplit('.', 1)[-2]}.{outfile_fmt}\"\r\n                    outfile = open(outfile_name, \"w\", encoding=\"utf8\")\r\n                    logging.info(f\"识别成功\")\r\n                    # 识别成功, 
回读字幕数据\r\n                    result = task_resp.parse()\r\n                    break\r\n            time.sleep(interval)\r\n        if not result.has_data():\r\n            logging.error(\"未识别到语音\")\r\n            sys.exit(-1)\r\n        match outfile_fmt:\r\n            case \"srt\":\r\n                outfile.write(result.to_srt())\r\n            case \"lrc\":\r\n                outfile.write(result.to_lrc())\r\n            case \"json\":\r\n                outfile.write(result.model_dump_json())\r\n            case \"txt\":\r\n                outfile.write(result.to_txt())\r\n        outfile.close()\r\n        logging.info(f\"转换成功: {outfile_name}\")\r\n    except APIError as err:\r\n        logging.error(f\"接口错误: {err}\")\r\n        sys.exit(-1)\r\n\r\n\r\nclass APIError(Exception):\r\n    \"API call error\"\r\n\r\n    def __init__(self, code, msg) -> None:\r\n        self.code = code\r\n        self.msg = msg\r\n        super().__init__()\r\n\r\n    def __str__(self) -> str:\r\n        return f\"{self.code}:{self.msg}\"\r\n\r\n\r\nclass BcutASR:\r\n    \"Bcut speech recognition API client\"\r\n    session: requests.Session\r\n    sound_name: str\r\n    sound_bin: bytes\r\n    sound_fmt: SUPPORT_SOUND_FORMAT\r\n    __in_boss_key: str\r\n    __resource_id: str\r\n    __upload_id: str\r\n    __upload_urls: list[str]\r\n    __per_size: int\r\n    __clips: int\r\n    __etags: list[str]\r\n    __download_url: str\r\n    task_id: str\r\n\r\n    def __init__(self, file: Optional[str | PathLike] = None) -> None:\r\n        self.session = requests.Session()\r\n        self.task_id = None\r\n        self.__etags = []\r\n        if file:\r\n            self.set_data(file)\r\n\r\n    def set_data(\r\n        self,\r\n        file: Optional[str | PathLike] = None,\r\n        raw_data: Optional[bytes] = None,\r\n        data_fmt: Optional[SUPPORT_SOUND_FORMAT] = None,\r\n    ) -> None:\r\n        \"Set the data to be recognized\"\r\n        if file:\r\n            if not isinstance(file, (str, PathLike)):\r\n                
raise TypeError(\"unknown file ptr\")\r\n            # File path\r\n            file = Path(file)\r\n            self.sound_bin = file.read_bytes()\r\n            suffix = data_fmt or file.suffix[1:]\r\n            self.sound_name = file.name\r\n        elif raw_data:\r\n            # Raw bytes\r\n            self.sound_bin = raw_data\r\n            suffix = data_fmt\r\n            self.sound_name = f\"{int(time.time())}.{suffix}\"\r\n        else:\r\n            raise ValueError(\"none set data\")\r\n        if suffix not in SUPPORT_SOUND_FORMAT.__args__:\r\n            raise TypeError(\"format is not supported\")\r\n        self.sound_fmt = suffix\r\n        logging.info(f\"加载文件成功: {self.sound_name}\")\r\n\r\n    def upload(self) -> None:\r\n        \"Request an upload, then upload and commit the data\"\r\n        if not self.sound_bin or not self.sound_fmt:\r\n            raise ValueError(\"none set data\")\r\n        resp = self.session.post(\r\n            API_REQ_UPLOAD,\r\n            data={\r\n                \"type\": 2,\r\n                \"name\": self.sound_name,\r\n                \"size\": len(self.sound_bin),\r\n                \"resource_file_type\": self.sound_fmt,\r\n                \"model_id\": 7,\r\n            },\r\n        )\r\n        resp.raise_for_status()\r\n        resp = resp.json()\r\n        code = resp[\"code\"]\r\n        if code:\r\n            raise APIError(code, resp[\"message\"])\r\n        resp_data = ResourceCreateRspSchema.model_validate(resp[\"data\"])\r\n        self.__in_boss_key = resp_data.in_boss_key\r\n        self.__resource_id = resp_data.resource_id\r\n        self.__upload_id = resp_data.upload_id\r\n        self.__upload_urls = resp_data.upload_urls\r\n        self.__per_size = resp_data.per_size\r\n        self.__clips = len(resp_data.upload_urls)\r\n        logging.info(\r\n            f\"申请上传成功, 总计大小{resp_data.size // 1024}KB, {self.__clips}分片, 分片大小{resp_data.per_size // 1024}KB: {self.__in_boss_key}\"\r\n        )\r\n        self.__upload_part()\r\n        
self.__commit_upload()\r\n\r\n    def __upload_part(self) -> None:\r\n        \"Upload the audio data clip by clip\"\r\n        for clip in range(self.__clips):\r\n            start_range = clip * self.__per_size\r\n            end_range = (clip + 1) * self.__per_size\r\n            logging.info(f\"开始上传分片{clip}: {start_range}-{end_range}\")\r\n            resp = self.session.put(\r\n                self.__upload_urls[clip],\r\n                data=self.sound_bin[start_range:end_range],\r\n            )\r\n            resp.raise_for_status()\r\n            etag = resp.headers.get(\"Etag\")\r\n            self.__etags.append(etag)\r\n            logging.info(f\"分片{clip}上传成功: {etag}\")\r\n\r\n    def __commit_upload(self) -> None:\r\n        \"Commit the uploaded data\"\r\n        resp = self.session.post(\r\n            API_COMMIT_UPLOAD,\r\n            data={\r\n                \"in_boss_key\": self.__in_boss_key,\r\n                \"resource_id\": self.__resource_id,\r\n                \"etags\": \",\".join(self.__etags),\r\n                \"upload_id\": self.__upload_id,\r\n                \"model_id\": 7,\r\n            },\r\n        )\r\n        resp.raise_for_status()\r\n        resp = resp.json()\r\n        code = resp[\"code\"]\r\n        if code:\r\n            raise APIError(code, resp[\"message\"])\r\n        resp_data = ResourceCompleteRspSchema.model_validate(resp[\"data\"])\r\n        self.__download_url = resp_data.download_url\r\n        logging.info(f\"提交成功\")\r\n\r\n    def create_task(self) -> str:\r\n        \"Create the recognition task\"\r\n        resp = self.session.post(\r\n            API_CREATE_TASK, json={\"resource\": self.__download_url, \"model_id\": \"7\"}\r\n        )\r\n        resp.raise_for_status()\r\n        resp = resp.json()\r\n        code = resp[\"code\"]\r\n        if code:\r\n            raise APIError(code, resp[\"message\"])\r\n        resp_data = TaskCreateRspSchema.model_validate(resp[\"data\"])\r\n        self.task_id = resp_data.task_id\r\n        logging.info(f\"任务已创建: 
{self.task_id}\")\r\n        return self.task_id\r\n\r\n    def result(self, task_id: Optional[str] = None) -> ResultRspSchema:\r\n        \"Query the recognition result\"\r\n        resp = self.session.get(\r\n            API_QUERY_RESULT, params={\"model_id\": 7, \"task_id\": task_id or self.task_id}\r\n        )\r\n        resp.raise_for_status()\r\n        resp = resp.json()\r\n        code = resp[\"code\"]\r\n        if code:\r\n            raise APIError(code, resp[\"message\"])\r\n        return ResultRspSchema.model_validate(resp[\"data\"])\r\n"
  },
  {
    "path": "bcut_asr/__main__.py",
    "content": "import logging\r\nimport sys\r\nimport time\r\nfrom argparse import ArgumentParser, FileType\r\n\r\nimport ffmpeg\r\n\r\nfrom . import APIError, BcutASR, ResultStateEnum\r\n\r\nlogging.basicConfig(\r\n    format=\"%(asctime)s - [%(levelname)s] %(message)s\",\r\n    level=logging.INFO,\r\n)\r\n\r\nINFILE_FMT = [\"flac\", \"aac\", \"m4a\", \"mp3\", \"wav\"]\r\nOUTFILE_FMT = [\"srt\", \"json\", \"lrc\", \"txt\"]\r\n\r\nparser = ArgumentParser(\r\n    prog=\"bcut-asr\",\r\n    description=\"必剪语音识别\\n\",\r\n    epilog=f\"支持输入音频格式: {', '.join(INFILE_FMT)}  支持自动调用ffmpeg提取视频伴音\",\r\n)\r\nparser.add_argument(\r\n    \"-f\", \"--format\", nargs=\"?\", default=\"srt\", choices=OUTFILE_FMT, help=\"输出字幕格式\"\r\n)\r\nparser.add_argument(\r\n    \"-i\",\r\n    \"--interval\",\r\n    nargs=\"?\",\r\n    type=float,\r\n    default=\"1.0\",\r\n    metavar=\"1.0\",\r\n    help=\"任务状态轮询间隔(秒)\",\r\n)\r\nparser.add_argument(\"input\", type=FileType(\"rb\"), help=\"输入媒体文件\")\r\nparser.add_argument(\r\n    \"output\",\r\n    nargs=\"?\",\r\n    type=FileType(\"w\", encoding=\"utf8\"),\r\n    help=\"输出字幕文件, 可stdout\",\r\n)\r\n\r\n\r\ndef ffmpeg_render(media_file: str) -> bytes:\r\n    \"提取视频伴音并转码为aac格式\"\r\n    out, err = (\r\n        ffmpeg.input(media_file, v=\"warning\")\r\n        .output(\"pipe:\", ac=1, format=\"adts\")\r\n        .run(capture_stdout=True)\r\n    )\r\n    return out\r\n\r\n\r\ndef main():\r\n    # 处理输入文件情况\r\n    args = parser.parse_args()\r\n    infile = args.input\r\n    infile_name = infile.name\r\n    if infile_name == \"<stdin>\":\r\n        logging.error(\"输入文件错误\")\r\n        return -1\r\n    suffix = infile_name.rsplit(\".\", 1)[-1]\r\n    if suffix in INFILE_FMT:\r\n        infile_fmt = suffix\r\n        infile_data = infile.read()\r\n    else:\r\n        # ffmpeg分离视频伴音\r\n        logging.info(\"非标准音频文件, 尝试调用ffmpeg转码\")\r\n        try:\r\n            infile_data = ffmpeg_render(infile_name)\r\n        except ffmpeg.Error:\r\n            
logging.error(\"ffmpeg转码失败\")\r\n            return -1\r\n        else:\r\n            logging.info(\"ffmpeg转码完成\")\r\n            infile_fmt = \"aac\"\r\n\r\n    # 处理输出文件情况\r\n    outfile = args.output\r\n    if outfile is None:\r\n        # 未指定输出文件，默认为文件名同输入，可以 -t 传参，默认str格式\r\n        if args.format is not None:\r\n            outfile_fmt = args.format\r\n        else:\r\n            outfile_fmt = \"srt\"\r\n    else:\r\n        # 指定输出文件\r\n        outfile_name = outfile.name\r\n        if outfile.name == \"<stdout>\":\r\n            # stdout情况，可以 -t 传参，默认str格式\r\n            if args.format is not None:\r\n                outfile_fmt = args.format\r\n            else:\r\n                outfile_fmt = \"srt\"\r\n        else:\r\n            suffix = outfile_name.rsplit(\".\", 1)[-1]\r\n            if suffix in OUTFILE_FMT:\r\n                outfile_fmt = suffix\r\n            else:\r\n                logging.error(\"输出格式错误\")\r\n                return -1\r\n\r\n    interval = args.interval\r\n    if interval is None:\r\n        interval = 1.0\r\n    # 开始执行转换逻辑\r\n    asr = BcutASR()\r\n    asr.set_data(raw_data=infile_data, data_fmt=infile_fmt)\r\n    try:\r\n        # 上传文件\r\n        asr.upload()\r\n        # 创建任务\r\n        task_id = asr.create_task()\r\n        while True:\r\n            # 轮询检查任务状态\r\n            task_resp = asr.result()\r\n            match task_resp.state:\r\n                case ResultStateEnum.STOP:\r\n                    logging.info(f\"等待识别开始\")\r\n                case ResultStateEnum.RUNING:\r\n                    logging.info(f\"识别中 {task_resp.remark}\")\r\n                case ResultStateEnum.ERROR:\r\n                    logging.error(f\"识别失败 {task_resp.remark}\")\r\n                    sys.exit(-1)\r\n                case ResultStateEnum.COMPLETE:\r\n                    logging.info(f\"识别成功\")\r\n                    outfile_name = f\"{infile_name.rsplit('.', 1)[-2]}.{outfile_fmt}\"\r\n                    outfile = 
open(outfile_name, \"w\", encoding=\"utf8\")\r\n                    # 识别成功, 回读字幕数据\r\n                    result = task_resp.parse()\r\n                    break\r\n            time.sleep(interval)\r\n        if not result.has_data():\r\n            logging.error(\"未识别到语音\")\r\n            return -1\r\n        match outfile_fmt:\r\n            case \"srt\":\r\n                outfile.write(result.to_srt())\r\n            case \"lrc\":\r\n                outfile.write(result.to_lrc())\r\n            case \"json\":\r\n                outfile.write(result.model_dump_json())\r\n            case \"txt\":\r\n                outfile.write(result.to_txt())\r\n        outfile.close()\r\n        logging.info(f\"转换成功: {outfile_name}\")\r\n    except APIError as err:\r\n        logging.error(f\"接口错误: {err.__str__()}\")\r\n        return -1\r\n\r\n\r\nif __name__ == \"__main__\":\r\n    sys.exit(main())\r\n"
  },
  {
    "path": "bcut_asr/orm.py",
    "content": "from enum import Enum\r\nfrom typing import Optional\r\nfrom pydantic import BaseModel\r\n\r\n\r\nclass ASRDataSeg(BaseModel):\r\n    \"\"\"文字识别-断句\"\"\"\r\n\r\n    class ASRDataWords(BaseModel):\r\n        \"\"\"文字识别-逐字\"\"\"\r\n\r\n        label: str\r\n        start_time: int\r\n        end_time: int\r\n\r\n    start_time: int\r\n    end_time: int\r\n    transcript: str\r\n    words: list[ASRDataWords]\r\n\r\n    def to_srt_ts(self) -> str:\r\n        \"\"\"转换为srt时间戳\"\"\"\r\n\r\n        def _conv(ms: int) -> tuple[int, int, int, int]:\r\n            return ms // 3600000, ms // 60000 % 60, ms // 1000 % 60, ms % 1000\r\n\r\n        s_h, s_m, s_s, s_ms = _conv(self.start_time)\r\n        e_h, e_m, e_s, e_ms = _conv(self.end_time)\r\n        return f\"{s_h:02d}:{s_m:02d}:{s_s:02d},{s_ms:03d} --> {e_h:02d}:{e_m:02d}:{e_s:02d},{e_ms:03d}\"\r\n\r\n    def to_lrc_ts(self) -> str:\r\n        \"\"\"转换为lrc时间戳\"\"\"\r\n\r\n        def _conv(ms: int) -> tuple[int, int, int]:\r\n            return ms // 60000, ms // 1000 % 60, ms % 1000 // 10\r\n\r\n        s_m, s_s, s_ms = _conv(self.start_time)\r\n        return f\"[{s_m:02d}:{s_s:02d}.{s_ms:02d}]\"\r\n\r\n\r\nclass ASRData(BaseModel):\r\n    \"\"\"语音识别结果\"\"\"\r\n\r\n    utterances: list[ASRDataSeg]\r\n    version: str\r\n\r\n    def __iter__(self):\r\n        return iter(self.utterances)\r\n\r\n    def has_data(self) -> bool:\r\n        \"\"\"是否识别到数据\"\"\"\r\n        return len(self.utterances) > 0\r\n\r\n    def to_txt(self) -> str:\r\n        \"\"\"转成 txt 格式字幕 (无时间标记)\"\"\"\r\n        return \"\\n\".join(seg.transcript for seg in self.utterances)\r\n\r\n    def to_srt(self) -> str:\r\n        \"\"\"转成 srt 格式字幕\"\"\"\r\n        return \"\\n\".join(\r\n            f\"{n}\\n{seg.to_srt_ts()}\\n{seg.transcript}\\n\"\r\n            for n, seg in enumerate(self.utterances, 1)\r\n        )\r\n\r\n    def to_lrc(self) -> str:\r\n        \"\"\"转成 lrc 格式字幕\"\"\"\r\n        return \"\\n\".join(\r\n            
f\"{seg.to_lrc_ts()}{seg.transcript}\" for seg in self.utterances\r\n        )\r\n\r\n    def to_ass(self) -> str:\r\n        \"\"\"转换为 ass 格式\"\"\"\r\n        # TODO: ass 序列化实现\r\n        raise NotImplementedError\r\n\r\n\r\nclass ResourceCreateRspSchema(BaseModel):\r\n    \"\"\"上传申请响应\"\"\"\r\n\r\n    resource_id: str\r\n    title: str\r\n    type: int\r\n    in_boss_key: str\r\n    size: int\r\n    upload_urls: list[str]\r\n    upload_id: str\r\n    per_size: int\r\n\r\n\r\nclass ResourceCompleteRspSchema(BaseModel):\r\n    \"\"\"上传提交响应\"\"\"\r\n\r\n    resource_id: str\r\n    download_url: str\r\n\r\n\r\nclass TaskCreateRspSchema(BaseModel):\r\n    \"\"\"任务创建响应\"\"\"\r\n\r\n    resource: str\r\n    result: str\r\n    task_id: str  # 任务id\r\n\r\n\r\nclass ResultStateEnum(Enum):\r\n    \"\"\"任务状态枚举\"\"\"\r\n\r\n    STOP = 0  # 未开始\r\n    RUNING = 1  # 运行中\r\n    ERROR = 3  # 错误\r\n    COMPLETE = 4  # 完成\r\n\r\n\r\nclass ResultRspSchema(BaseModel):\r\n    \"\"\"任务结果查询响应\"\"\"\r\n\r\n    task_id: str  # 任务id\r\n    result: Optional[str] = None  # 结果数据-json 在 state 1 的情况为 None\r\n    remark: str  # 任务状态详情\r\n    state: ResultStateEnum  # 任务状态\r\n\r\n    def parse(self) -> ASRData:\r\n        \"解析结果数据\"\r\n        return ASRData.model_validate_json(self.result)\r\n"
  },
  {
    "path": "pyproject.toml",
    "content": "[tool.poetry]\r\nname = \"bcut-asr\"\r\nversion = \"0.0.3\"\r\ndescription = \"使用必剪API的语音字幕识别\"\r\nauthors = [\"SocialSisterYi <1440239038@qq.com>\"]\r\nlicense = \"MIT License\"\r\nreadme = \"README.md\"\r\n\r\n[tool.poetry.dependencies]\r\npython = \">=3.10\"\r\nrequests = \"^2.31.0\"\r\npydantic = \"^2.7.0\"\r\nffmpeg-python = \"^0.2.0\"\r\n\r\n[tool.poetry.scripts]\r\nbcut-asr = \"bcut_asr.__main__:main\"\r\n\r\n[tool.poetry.group.dev.dependencies]\r\nblack = \"^24.4.0\"\r\n\r\n[build-system]\r\nrequires = [\"poetry-core\"]\r\nbuild-backend = \"poetry.core.masonry.api\"\r\n"
  }
]